haystack_continuous_highD.Rd
The main Haystack function, for higher-dimensional spaces and continuous expression levels.
haystack_continuous_highD(
  x,
  expression,
  grid.points = 100,
  weights.advanced.Q = NULL,
  dir.randomization = NULL,
  scale = TRUE,
  grid.method = "centroid",
  randomization.count = 100,
  n.genes.to.randomize = 100,
  selection.method.genes.to.randomize = "heavytails",
  grid.coord = NULL,
  genes.to.randomize = NULL,
  spline.method = "ns"
)
x: Coordinates of cells in a 2D or higher-dimensional space. Rows represent cells, columns the dimensions of the space.
expression: A matrix with expression data of genes (rows) in cells (columns).
grid.points: An integer specifying the number of centers (grid points) used to estimate the density distributions of cells. Default: 100.
weights.advanced.Q: (Default: NULL) Optional weights of cells for calculating a weighted distribution of expression.
dir.randomization: If NULL (default), no output about the random sampling step is written. If not NULL, files related to the randomizations are written to this directory.
scale: Logical (default: TRUE) indicating whether the input coordinates in x should be scaled to mean 0 and standard deviation 1.
grid.method: The method used to decide grid points for estimating the density in the high-dimensional space. Should be "centroid" (default) or "seeding".
randomization.count: Number of randomizations to use. Default: 100.
n.genes.to.randomize: Number of genes to use in randomizations. Default: 100.
selection.method.genes.to.randomize: Method used to select genes for randomization. Default: "heavytails".
grid.coord: Matrix of grid coordinates. Default: NULL.
spline.method: Method to use for fitting splines: "ns" (default) for natural splines, or "bs" for B-splines.
An object of class "haystack", including the results of the analysis, and the coordinates of the grid points used to estimate densities.
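The effect of scale = TRUE and the role of grid.points can be illustrated with base R alone. The sketch below is not the package's internal code: it assumes that scaling is equivalent to base R's scale() and, as a further assumption, that the "centroid" grid method behaves like k-means clustering of the scaled coordinates. The variable names and the simulated coordinates are hypothetical.

```r
# Hypothetical input: 200 cells in a 5-dimensional space (rows = cells).
set.seed(1)
x <- matrix(rnorm(200 * 5, mean = 3, sd = 2), nrow = 200, ncol = 5)

# Scale every dimension to mean 0 and standard deviation 1,
# which is what the 'scale' argument is documented to do.
x.scaled <- scale(x)

# ASSUMPTION: pick grid points as k-means centroids of the scaled cells.
# The package's "centroid" method may differ in its details.
n.grid <- 10
km <- kmeans(x.scaled, centers = n.grid)
grid.coord.sketch <- km$centers

# One row per grid point, one column per dimension of the space.
print(dim(grid.coord.sketch))
```

A matrix of this shape (grid points in rows, dimensions in columns) is the kind of object the grid.coord argument describes, although whether the function expects it in scaled or unscaled coordinates is not stated here.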
# using the toy example of the singleCellHaystack package
library(singleCellHaystack)
# running haystack on the toy coordinates (dat.tsne) and expression data (dat.expression)
res <- haystack(dat.tsne, dat.expression)
#> ### calling haystack_continuous_highD()...
#> ### Using package sparseMatrixStats to speed up statistics in sparse matrices.
#> ### Calculating row-wise mean and SD...
#> ### Filtered 0 genes with zero variance...
#> ### Using 100 randomizations...
#> ### Using 100 genes to randomize...
#> Warning: The value of 'grid.points' appears to be very high (> No. of cells / 10). You can set the number of grid points using the 'grid.points' parameter.
#> ### scaling input data...
#> ### deciding grid points...
#> ### calculating Kullback-Leibler divergences...
#> ### performing randomizations...
#> ### estimating p-values...
#> ### picking model for mean D_KL...
#> ### using natural splines
#> ### best RMSD : 0.09
#> ### best df : 3
#> ### picking model for stdev D_KL...
#> ### using natural splines
#> ### best RMSD : 0.018
#> ### best df : 5
#> ### returning result...
# list top 10 biased genes
show_result_haystack(res, n=10)
#>              D_KL log.p.vals log.p.adj
#> gene_79  2.405028  -38.82043 -36.12146
#> gene_242 1.850754  -38.06052 -35.36155
#> gene_339 1.940697  -37.15175 -34.45278
#> gene_71  2.673197  -36.08386 -33.38489
#> gene_275 1.841230  -35.32151 -32.62254
#> gene_62  2.175734  -34.89847 -32.19950
#> gene_351 1.906705  -33.77042 -31.07145
#> gene_479 2.455483  -33.13287 -30.43390
#> gene_300 2.230951  -32.52041 -29.82144
#> gene_429 1.705867  -31.71053 -29.01156
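The relationship between the log.p.vals and log.p.adj columns can be checked directly from the printed values. The sketch below copies three rows from the output above and computes their difference. Reading the constant offset as a multiplicity correction on the log10 scale is an assumption inferred from the numbers, not something this page states.

```r
# Values copied from the show_result_haystack() output above.
log.p.vals <- c(gene_79 = -38.82043, gene_242 = -38.06052, gene_339 = -37.15175)
log.p.adj  <- c(gene_79 = -36.12146, gene_242 = -35.36155, gene_339 = -34.45278)

# The adjusted values differ from the raw ones by the same offset in every row.
offset <- log.p.adj - log.p.vals
print(round(offset, 5))

# ASSUMPTION: the values are log10 p-values and the offset is log10(number of tests),
# as in a Bonferroni-style correction. Under that reading, the implied number of tests is:
n.tests <- round(10^mean(offset))
print(n.tests)
```

Under these assumptions the offset corresponds to 500 tests, which would match a toy dataset of 500 genes.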