dl.helpers package¶
Submodules¶
dl.helpers.all module¶
dl.helpers.cluster module¶
Data Lab helpers for clustering.
-
dl.helpers.cluster.
constructOutlines
(x, y, clusterlabels)[source]¶ Construct the convex hull (outline) of points in (x,y) feature space.
- Parameters
x, y – Location of points in (x,y) feature space (e.g. RA & Dec).
- Returns
hull – The convex hull of points (x,y), an instance of
scipy.spatial.qhull.ConvexHull.
- Return type
instance
Example
Given x & y coordinates as 1d sequences:
import numpy as np
import matplotlib.pyplot as plt
points = np.vstack((x,y)).T  # make 2-d array of correct shape
hull = constructOutlines(x,y)
plt.plot(points[hull.vertices,0], points[hull.vertices,1], 'r-', lw=2)  # plot the hull
plt.plot(points[hull.vertices[[-1,0]],0], points[hull.vertices[[-1,0]],1], 'r-', lw=2)  # close the hull: join last vertex to first
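The closing step in the example above can be illustrated without the helper. The following is a minimal sketch that calls SciPy's ConvexHull directly on made-up sample points (the data and the variable names closed and outline are assumptions for illustration, not part of dl.helpers.cluster). For 2-D input, SciPy returns hull.vertices in counterclockwise order, so appending the first vertex once more yields a closed outline.

```python
import numpy as np
from scipy.spatial import ConvexHull

# Hypothetical sample data: four corners of a unit square plus one interior point.
x = np.array([0.0, 1.0, 1.0, 0.0, 0.5])
y = np.array([0.0, 0.0, 1.0, 1.0, 0.5])

points = np.vstack((x, y)).T   # shape (n_points, 2)
hull = ConvexHull(points)      # computed directly with SciPy, not via the helper

# For 2-D input, hull.vertices lists the outline points in counterclockwise order.
# Appending the first vertex again closes the polygon.
closed = np.append(hull.vertices, hull.vertices[0])
outline = points[closed]       # shape (n_vertices + 1, 2)

# With matplotlib available, the closed outline could be drawn in one call:
#   plt.plot(outline[:, 0], outline[:, 1], 'r-', lw=2)
```

The interior point is not part of hull.vertices, so only the four corners appear in the outline.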
-
dl.helpers.cluster.
findClusters
(x, y, method='MiniBatchKMeans', **kwargs)[source]¶ Find 2D clusters from x & y data.
- Parameters
x, y – Location of points in (x,y) feature space, e.g. RA & Dec, but x & y need not be spatial in nature.
method (str) – Cluster finding method from
sklearn.cluster
to use. Default: ‘MiniBatchKMeans’ (a streaming implementation of KMeans), which is very fast, but not the most robust. ‘DBSCAN’ is much more robust, but MUCH slower. For other methods, consult sklearn.cluster.
**kwargs –
Any other keyword arguments will be passed to the cluster finding method. If method=‘MiniBatchKMeans’ or ‘KMeans’, n_clusters (integer number of clusters to find) must be passed, e.g.
clusters = findClusters(x,y,method='MiniBatchKMeans',n_clusters=3)
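To make the role of n_clusters concrete, here is a hedged sketch that runs scikit-learn's MiniBatchKMeans directly on synthetic data. It does not call findClusters, and it makes no claim about what findClusters returns; the sample blobs, the random seeds, and the names X, model and labels are assumptions for illustration only.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

# Hypothetical data: three well-separated blobs of 50 points each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
x = np.concatenate([c[0] + 0.3 * rng.standard_normal(50) for c in centers])
y = np.concatenate([c[1] + 0.3 * rng.standard_normal(50) for c in centers])

# scikit-learn expects one row per point and one column per feature.
X = np.column_stack((x, y))

# n_clusters is required for KMeans-type methods; n_init and random_state
# are set here only to make the sketch reproducible.
model = MiniBatchKMeans(n_clusters=3, n_init=3, random_state=0)
labels = model.fit_predict(X)   # one integer cluster label per input point
```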
dl.helpers.crossmatch module¶
Data Lab helpers for (local) positional cross-matching.
-
dl.helpers.crossmatch.
xmatch
(ra1, dec1, ra2, dec2, maxdist=None, units='deg', method='astropy', **kwargs)[source]¶ Cross-match two sets of ra & dec coordinates locally (i.e. all coordinates are in RAM).
The function will search for counterparts of ra1/dec1 coordinates in the ra2/dec2 coordinate set, i.e. one can consider ra2/dec2 to be the catalog that will be searched.
- Parameters
ra1, dec1 – RA and declination of the first coordinate set, in the units specified by units.
ra2, dec2 – RA and declination of the second coordinate set, in the units specified by units.
maxdist (float or None) – If not None, then it is the maximum angular distance (in units of units) to be considered. All distances greater than that will be considered non-matches. If None, then all ra1/dec1 will have matches in ra2/dec2.
units (str) – Units of ra1, dec1, ra2, dec2. Default: ‘deg’ (decimal degrees).
method (str) – Currently only astropy’s
match_to_catalog_sky()
method is supported, i.e. the default ‘astropy’.
- Other Parameters
nthneighbor (int, optional) – Applies if method='astropy'. Which closest neighbor to search for. Typically 1 is desired here, as that is correct for matching one set of coordinates to another. The next likely use case is 2, for matching a coordinate catalog against itself (1 is inappropriate because each point will find itself as the closest match).
- Returns
idx (1-d array) – Index values of the ra1/dec1 counterparts found in ra2/dec2. Thus ra2[idx], dec2[idx] will select from the ra2/dec2 catalog the matched counterparts of the ra1/dec1 coordinate pairs.
If maxdist was not None but a number instead, then ‘idx’ only contains the objects matched up to the maxdist radius.
dist2d (1-d array) – The angular distances of the matches found in the ra2/dec2 catalog. In units of units.
If maxdist was not None but a number instead, then ‘dist2d’ only contains the objects matched up to the maxdist radius.
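The meaning of idx, dist2d and maxdist can be illustrated with a small brute-force sketch in NumPy. This is not how xmatch works internally (it delegates to astropy's match_to_catalog_sky()); the function brute_force_xmatch and the sample coordinates are hypothetical, all inputs are assumed to be in degrees, and the approach only scales to small inputs because it builds the full pairwise distance matrix.

```python
import numpy as np

def brute_force_xmatch(ra1, dec1, ra2, dec2, maxdist=None):
    """Illustrative nearest-neighbor match in degrees (hypothetical helper, O(N*M))."""
    ra1, dec1, ra2, dec2 = (np.radians(np.asarray(a, dtype=float))
                            for a in (ra1, dec1, ra2, dec2))
    # Haversine formula for every (set 1, set 2) pair of positions.
    dra = ra1[:, None] - ra2[None, :]
    ddec = dec1[:, None] - dec2[None, :]
    h = (np.sin(ddec / 2) ** 2
         + np.cos(dec1)[:, None] * np.cos(dec2)[None, :] * np.sin(dra / 2) ** 2)
    sep = np.degrees(2 * np.arcsin(np.sqrt(np.clip(h, 0.0, 1.0))))
    # Closest entry of set 2 for each entry of set 1, and its distance.
    idx = sep.argmin(axis=1)
    dist2d = sep[np.arange(len(idx)), idx]
    # With maxdist set, keep only the matches within that radius.
    if maxdist is not None:
        keep = dist2d <= maxdist
        idx, dist2d = idx[keep], dist2d[keep]
    return idx, dist2d

# Hypothetical coordinates: the third source in set 1 has no close counterpart.
ra1, dec1 = [10.0, 50.0, 120.0], [0.0, 0.0, 0.0]
ra2, dec2 = [50.001, 10.0005, 200.0], [0.0, 0.0, 0.0]

idx_all, dist_all = brute_force_xmatch(ra1, dec1, ra2, dec2)
idx_cut, dist_cut = brute_force_xmatch(ra1, dec1, ra2, dec2, maxdist=0.01)
```

Without maxdist every source in set 1 receives a nearest neighbor, however distant; with maxdist=0.01 the unmatched third source drops out of both returned arrays.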
dl.helpers.legacy module¶
Legacy helpers for Data Lab. Most are deprecated.
-
class
dl.helpers.legacy.
Querist
(username='anonymous')[source]¶ Bases:
object
-
checkAsyncJob
()[source]¶ Check the first async job in the FIFO queue (if queue is not empty).
- Parameters
None –
- Returns
Always returns a 3-tuple. If no async job was in the queue,
returns (None,None,None). If there was an async query in the
queue but its status did not return ‘COMPLETED’, re-inserts
the query at its old position in the queue, and returns
(None,None,None). If the status was ‘COMPLETED’, returns the
tuple (query result,outfmt,preview).
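The queue behavior described above can be sketched with a collections.deque. The class AsyncQueueSketch, its methods, and the status and result callables are hypothetical stand-ins for illustration; this is not the Querist implementation, only a model of the documented return values.

```python
from collections import deque

class AsyncQueueSketch:
    """Hypothetical stand-in for the job queue; not the Querist implementation."""

    def __init__(self, get_status, get_result):
        self.queue = deque()          # FIFO of (jobid, outfmt, preview) entries
        self.get_status = get_status  # callable: jobid -> status string
        self.get_result = get_result  # callable: jobid -> query result

    def submit(self, jobid, outfmt, preview):
        self.queue.append((jobid, outfmt, preview))

    def check(self):
        # Empty queue: nothing to check.
        if not self.queue:
            return (None, None, None)
        jobid, outfmt, preview = self.queue.popleft()
        if self.get_status(jobid) != 'COMPLETED':
            # Not done yet: put the job back at the front (its old position).
            self.queue.appendleft((jobid, outfmt, preview))
            return (None, None, None)
        return (self.get_result(jobid), outfmt, preview)

# Usage with made-up statuses and results.
statuses = {'job1': 'EXECUTING', 'job2': 'COMPLETED'}
results = {'job1': 'ra,dec\n1,2\n', 'job2': 'ra,dec\n3,4\n'}
q = AsyncQueueSketch(statuses.get, results.get)

empty = q.check()                    # queue is empty
q.submit('job1', 'pandas', 10)
q.submit('job2', 'array', 5)
pending = q.check()                  # job1 still running, stays first in line
statuses['job1'] = 'COMPLETED'
done = q.check()                     # job1 finished and is removed from the queue
```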
-
property
output_formats
¶ Pretty-print to STDOUT the available outfmt values.
- Parameters
None –
- Returns
- Return type
Nothing
-
dl.helpers.plot module¶
dl.helpers.utils module¶
Data Lab utility helper functions.
-
dl.helpers.utils.
convert
(inp, outfmt='pandas', verbose=False, **kwargs)[source]¶ Convert input inp to a data structure defined by outfmt.
- Parameters
inp (str) – String representation of the result of a query. Usually this is a CSV-formatted string, but it can also be, e.g., an XML-formatted votable (as a string).
outfmt (str) –
The desired data structure for converting inp to. Default: ‘pandas’, which returns a Pandas dataframe. Other available conversions are:
string - no conversion
array - Numpy array
structarray - Numpy structured array (also called record array)
table - Astropy Table
votable - Astropy VOtable
For outfmt=‘votable’, the input string must be an XML-formatted string. For all other values, a CSV-formatted string.
verbose (bool) – If True, print status message after conversion. Default: False
kwargs (optional params) – Will be passed as **kwargs to the converter method.
Example
Convert a CSV-formatted string to a Pandas dataframe
arr = convert(inp,'array')
arr.shape  # arr is a Numpy array
df = convert(inp,outfmt='pandas')
df.head()  # df is a Pandas dataframe, with all its methods
df = convert(inp,'pandas',na_values='Infinity')  # na_values is a kwarg; adds 'Infinity' to the list of values converted to NaN
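To show what the 'array' and 'structarray' targets look like for a CSV-formatted input string, here is a minimal sketch using only NumPy. It does not call convert and is not a description of how convert is implemented; the sample string and the use of np.genfromtxt are assumptions chosen for illustration.

```python
import io
import numpy as np

# Hypothetical query result: a CSV-formatted string with a header row.
inp = "ra,dec\n10.5,-3.25\n11.0,-3.5\n"

# Plain 2-d array: skip the header, one row per record.
arr = np.genfromtxt(io.StringIO(inp), delimiter=',', skip_header=1)

# Structured array: the header row supplies the field names.
structarr = np.genfromtxt(io.StringIO(inp), delimiter=',', names=True)
```

The plain array is addressed by position (arr[:, 0]), while the structured array keeps the column names (structarr['ra']).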
-
dl.helpers.utils.
normalizeCoordinates
(x, y, frame_in='icrs', units_in='deg', frame_out=None, wrap_at=180)[source]¶ Makes 2D spatial coordinates (e.g. RA & Dec) suitable for use with matplotlib’s all-sky projection plotting.
- Parameters
x, y – Location of points in (x,y) feature space (e.g. RA & Dec in degrees). Avoid supplying x and y as columns from a pandas dataframe, as this makes the coordinate conversions much slower. Numpy arrays, lists, and astropy table and votable columns are all fine.
frame_in (str) – Coordinate frame of x & y. Default: ‘icrs’. ‘galactic’ is also available. If the user desires other frames from
astropy.coordinates
, please contact __author__.
units_in (str) – Units of x & y. Default: ‘deg’ (degrees).
frame_out (None or str) – If not None, and not same as frame_in, the x & y coordinates will be transformed from frame_in to frame_out.
wrap_at (float) –
matplotlib
plotting functions such as matplotlib.scatter()
with all-sky projections expect the x-coordinate (e.g. RA) to be between -180 and +180 degrees (or more precisely: between -pi and +pi). The default wrap_at=180 shifts the input coordinate x (e.g. RA) accordingly.
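The wrapping described for wrap_at can be written out in a few lines of NumPy. This is a sketch of the arithmetic only, under the assumption that values are mapped into the half-open range [wrap_at - 360, wrap_at); the function wrap_degrees and the sample values are hypothetical and not part of dl.helpers.utils.

```python
import numpy as np

def wrap_degrees(x, wrap_at=180.0):
    """Map angles in degrees into [wrap_at - 360, wrap_at) (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    lower = wrap_at - 360.0
    return (x - lower) % 360.0 + lower

# Hypothetical RA values in degrees, in the usual 0..360 convention.
ra = np.array([0.0, 90.0, 270.0, 359.0])

wrapped = wrap_degrees(ra)        # now between -180 and +180 degrees
wrapped_rad = np.deg2rad(wrapped) # between -pi and +pi, as all-sky projections expect
```

With the default wrap_at=180, an RA of 270 degrees becomes -90 degrees and 359 degrees becomes -1 degree.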
-
dl.helpers.utils.
resolve
(name=None)[source]¶ Resolve object name to coordinates.
- Parameters
name (str or None) – If str, it is the name of the object to resolve. If None (default), a prompt for the object name will be presented.
- Returns
sc – Instance of SkyCoord from astropy. Get e.g. RA via sc.ra (with units), or sc.ra.value (without units). Or explicitly in a different coordinate system, e.g. sc.galactic.b, etc.
- Return type
instance
-
dl.helpers.utils.
vospace_readable_fileobj
(name_or_obj, token=None, **kwargs)[source]¶ Read data from VOSpace or some other place.
Notes
Most of the heavy lifting is done with
get_readable_fileobj()
. Any additional keywords passed to this function will get passed directly to that function.
- Parameters
name_or_obj (str or file-like object) – The filename of the file to access (if given as a string), or the file-like object to access.
If a file-like object, it must be opened in binary mode.
token (str) – A token granting access to VOSpace.
- Returns
A readable file-like object.
- Return type
file