# 0.0.111 (01.10.2021) - require minimum version of NCLS # 0.0.110 (20.09.21) - fix count_overlaps with keep_nonoverlapping=False - fix subtract with more than 1024 intervals (new fix) # 0.0.109 (16.09.21) - fix overlap invert behavior - add intersect invert flag - fix subtract in cases where more than 1024 intervals overlapped a single interval # 0.0.106/107/108(hotfixes) (07/8.09.21) - fix join with slack mutating first arg - add flag use_other_strand in join, nearest, k_nearest - fix categorical-bug in newer versions of pandas - add function pr.version_info() to print relevant version flags for debugging # 0.0.105 (23.08.21) - require bamread 0.0.10 to fix #211 # 0.0.104 (06/20.08.21) - fix broken three_end/five_end code # 0.0.102/103 (06.08.21) - fix bug in pr.count_overlaps - demand version 0.0.9 or greater from bamread # 0.0.100/0.0.101 (20/21.06.21) - add full-flag to read_gtf - fix bug in join with slack > 0 when result is empty # 0.0.99 (17.06.21) - add nb_cpu arg to overlap # 0.0.98 (07.06.21) - fix k-nearest how=None # 0.0.98 (20.05.21) - fix casting in tss/tes # 0.0.96/97 (07.05.21) - fixes to .tes and .tss methods (issue #182) # 0.0.95 (02.03.21) - teensy fix bedclip - add pretty-printing in jupyter notebooks (thanks to @rasi) # 0.0.94 (27.02.21) - print warning if start and end columns have different dtypes # 0.0.93 (25.02.21) - add max_disjoint for maximal disjoint set # 0.0.91-92 (15.01.21) - hotfix for 0.0.90 # 0.0.90 (03.01.21) - fix #165 slow set operations on small files with many chromosomes (thanks ndukler) # 0.0.89 (16.11.20) - fix #159 (thanks cfriedline) # 0.0.88 (09.11.20) - fix bug when concatting stranded and unstranded pyranges (thanks cfriedline, issue #160) # 0.0.87 (23.10.20) - fix bug in join with left/right option # 0.0.86 (05.10.20) - add slack-option to merge # 0.0.85 (17.09.20) - fix error when parsing gtf-files with whitespace in value-tags # 0.0.84 (18.08.20) - add option to report overlap in join # 0.0.83 (18.08.20) - hotfix # 0.0.82 (18.08.20) - fix error introduced in 0.0.80 # 0.0.81 (13.08.20) - fix Fisher's implementation # 0.0.80 (10.08.20) - fix reassigning chromosomes in apply # 0.0.79 (08.06.20) - fix bug in features.introns where the gene_id column was overwritten (issue #134) # 0.0.78 (18.03.20) - add reader for bigwig (pr.read_bigwig) - fix cluster (allow for multiple by arguments) - optimize to_bigwig slightly - fix: overlap did not recognize invert-argument # 0.0.77 (24.03.20) - add api-docs - make default strandedness of apply-pair equal None - add pr.from_string() to create a PyRanges from a multiline string - remove set_columns, set on .columns directly - apply numpy-methods to pyranges - add pr.get_fasta(gr, path) # 0.0.76 (20.02.20) - fix leftover print in itergrs # 0.0.75 (20.02.20) - reset index when reading pyranges from df - ignore reinit error in ray - did not use copy_df in init # 0.0.74 (12.02.20) - support for multiple (repeating) attributes in gtf reading - fix handling of kwargs in apply, apply_pair, apply_chunks - add to_example(nrows=10) to get a copy-paste friendly representation of a PyRanges - add pr.from_dict() to create a PyRanges from a dict (like the ones produced with to_example) # 0.0.73 (03.02.20) - fix small bug in jaccard - remove leftover debug-print in pr.random() - add experimental gr.stats.forbes - fix handling of kwargs in apply, apply_pair, apply_chunks # 0.0.72 (03.02.20) - random also takes dict as chromsizes argument (like {"chr1": 249, "chr2": 242}) - fix reldist bug when grs have different chromosomes # 0.0.71 (30.01.20) - fix various issues with reading and writing gtf/gff3 (1-indexing, removed final ";" in gff3 attribute col when writing) - remove ModuleNotFoundException in __init__.py (3.5 < only) - gr.overlap(gr2) now has default argument how="first", i.e. only return overlapping intervals once, even though there are multiple overlapping features in gr - fix bug in pr.stats.mcc when using stranded data # 0.0.70 (24.01.20) - add Simes method to pr.stats - add keys argument to pr.iter - make strand=None default arg for concat - gr.split() does opposite of merge - pr.count_overlaps(grs, features=None) like bedtools multiintersect added - set mkl.set_num_threads to 1 in __init__ # 0.0.69 (22.01.20) - add value col argument to to_bigwig (thanks https://github.com/liyao001) # 0.0.68 (21.01.20) - fix regression: slack changes dtype from int32 to int64 # 0.0.67 (10.01.20) - add dtypes attribute to pyranges - fix left and right join when chromosomes missing # 0.0.66 (03.01.20) - add argument sparse to read_bam. Setting it to False fetches more columns. # 0.0.65 (10.12.19) - fix column names after read_gtf so they work with GenomicFeatures - add flag chain, make False by default to to_* methods - genomicfeatures: add tss/tes-methods - fix column names after read_gtf so they work with GenomicFeatures - remove Strand column with unstrand() even if PyRanges is not stranded - reading gff and gtf now consistent and column names from attributes are in lower_snake_case # 0.0.64 (28.11.19) - add missing example data (ending with gz) to pyranges - add rowbased_spearman, rowbased_pearson and rowbased_rankdata to pyranges.stats - pyranges now accept columns with integer names, like pandas # 0.0.63 (14.11.19) - ignore index when inserting Series - able to add dictionary of dfs to a pyranges - remove FDR from fisher_exact, but add fdr as own method in stats - make stats.mcc faster - make stats.mcc work without a genome - fix gff3 reading when metadata contains spaces # 0.0.62 (11.11.19) - fix fisher exact when given pd.Series - fisher_exact: only use pseudocounts for OR # 0.0.61 (11.11.19) - add outer, inner and left join - add fisher exact - insert series/dataframes to pyranges with + operator - gr.Whatever = pd.Series(...) now ignores index - add gr.copy() method to create deep copy # 0.0.60 (10.11.19) - add k-nearest - ensure that start/end have the same dtype after calling slack - breaking change: new_position takes no default arg - new_position takes an argument swap - .length returns a python integer - breaking change: lengths returns a vector by default # 0.0.59 (28.10.19) - fix attributerrors on pyranges (thanks https://github.com/MuhammedHasan) - add reader for gff3 - add writer for gff3 - add count flag to cluster/cluster_by # 0.0.58 (25.10.19) - fix merge print functions - make pickleable - add iter as alias for itergrs in pr. namespace - gr.length() shows nucleotide length (sum of all interval lengths) - gr.lengths() takes as_dict=False flag to return as vector - fix slack in join: added columns when joining with itself - fix print for unstranded pyranges: printed tail and head of first chromosome # 0.0.57 (10.10.19) - add overlap-flag to tile - add chain to print-method - bugfix: printing stranded pyranges sorted output even though sort was false - bugfix: wrong number hidden cols on very small terminal widths - bugfix: unstrand did not change underlying dict to chromosomes only - show number of hidden columns in header - tests: mismatches in strand between dict and dataframes - .df/.as_df() now returns with non-duplicated index # 0).0.56 (25.09.19) - add possibility of 5-end and 3-end in slack being different/none - add slack to join-method - add new_position method to take union or intersection of two pairs of Start/End-columns in pyranges # 0.0.55 (13.09.19) - Add int64-flag to method pr.random. # 0.0.54 (10.09.19) - Ensure that Chromosome and Strand is str dtype before creating category - Add check to ensure that the columns Chromosome, Start and End exist when trying to create a PyRanges # 0.0.53 (02.09.19) - Fix error in pypi file # 0.0.52 (30.08.19) Fixes: - fix creating duplicate indexes in pyrange apply both - fix regression where joining unstranded and stranded pyrange did not make a stranded pyrange - default was strand=False for a few methods, should have been None (i.e. autodetect) - read_bed now handles gzipped bed (if the file has the .gz extension) - now able to print untraditional strands which are not strings - fix drop when "Strand" is part of what is to be dropped - more robust checking if column is in gr Additions: - print functions take formatting-argument {"Start": "{:,}"} Changes: - print shows sorted stranded data in Start/End order - print dynamically selects number of untraditional strands and hidden columns to display - read_bed now takes nrows arg - now assertion is raised if trying to drop "Chromosome", "Start" or "End" (instead of ignoring) - to_bed, to_gtf, to_csv now take compression argument ("infer" by default) - to_csv writes the header as default # 0.0.51 (01.08.19) Additions: - pr.itergrs added to iterate over the dfs from multiple pyranges at the same time Changes: - pybigwig and bamread are optional dependencies that need to be manually installed (like ray) # 0.0.50 (29.07.19) Additions: - pr.random(n=1000, length=100, chromsizes=None, strand=True) creates a random PyRanges from a PyRanges of chromosome sizes. Changes: - make __iter__ return natsorted items Removals: - insert. use join instead Fixes: - bug in boolean indexing due to __iter__ returning wrong sort order # 0.0.49 (26.07.19) Hotfix: - bug in assign (strand=False, by default, not None) # 0.0.48 (25.07.19) Additions: - head(n=8) - tail(n=8) - sample(n=8) - set_columns(new_names) to set new column names - argument like to drop, which takes string describing regex (gr.drop(like="_left|_right")) - add count (number of intervals) to merge and merge_by Fixes: - 5X faster boolean indexing - fix some bugs in features.introns when data was missing Changes: - coverage renamed to_rle - if drop used without argument, not dropping Strand by default # 0.0.47 (19.07.2019) hotfixes # 0.0.46 (19.07.2019) Additions: - cluster and merge takes argument by to only merge cluster within specific features - gr.features.introns added. Can use by="gene" or by="transcript" - new data: pr.data.gencode_gtf and pr.data.ucsc_bed - can subset pyrange with boolean vector - sort also takes argument by (sort without arg sorts on start/end) # 0.0.45 (14.06.2019) Fixes: - bug in subset which removed strand - bug when setting Strand with setattr - bug when setting Chromosome with setattr Changes: - new method to compute cluster (3x as fast) - string-arg to drop not interpreted as regex - drop or keep do not take drop_strand. Only unstrand can drop strand. Additions: - subsetting with new col order rearranges columns # 0.0.44 (04.06.19) Changes: - Now possible to reset Strand/Chromosome Additions: - gr.drop_duplicate_positions(strand=None) # None means auto => true if stranded otherwise False - add test data pr.data.chromsizes() - pr.gf.tile_genome(genome_pyrange, tile_size, tile_last=False) (like GenomicRanges tileGenome) - pr.gf.genome_bounds(pyrange, genome_pyrange, clip=False) (like UCSC bedclip) # 0.0.43 (29.05.19) Fixes: - fix bug in tostring - fix bug in multithreading Additions: - add apply_chunks, which operates on chunks, instead of chromosome-dfs. Changes: - add nb_cpu argument to all functions - add number of columns and stranded/unstranded to tostring - add ... as last column, when there are more columns than possible to show - use , as thousands separator in tostring for number rows/cols # 0.0.42 (16.05.19) Additions: - allow keyword-arguments to apply, apply_pair (see example in the docs) Changes: - to_csv etc. returns the objects themselves, so they can be used in method chains - methods called tile/window instead of tiles/windows Fixes: - fix print when len(pr) < entries to print - tile # 0.0.41 (14.05.19) Additions: - add slack-flag to cluster/merge - print joined positions possible - add simple methods for printing without breaking the chain (p, mp, sp, tmp, rp) Removals: - settings in pyranges. Added print methods instead. Improvement: - print methods faster, especially for pyranges with many cols # 0.0.40 (13.05.19) Additions: - pyranges_db now out on PyPI Changes: - PyRanges can now have Strand column with other data than "+" or "-", but it is considered unstranded. - Ensure that slack parameter is always integer. - no keep_metadata-flag in windows. Metadata is always kept. Can call drop() beforehand if metadata should not be kept. Remove: - remove confusing keep flag from drop method (use gr[cols_to_keep] instead) Fixes: - add missing ... in pyranges tostring # 0.0.39 (09.05.19) Removal: - remove sandbox module # 0.0.37-38 (09.05.19) Changes: - pyranges constructor is copy-by-default # 0.0.36 (09.05.19) Additions: - add insert method which uses overlap Changes: - read_bed does not fail when strand is "." - read_bed considers bed unstranded if Strand has other values than +/- # 0.0.35 (26.04.19) Changes: - tssify/tesify renamed five_end/three_end - five_end/three end fails when data does not contain strand Fixes: - slack changed pyrange in-place # 0.0.34 (25.04.19) Fixes: - assign changed pyrange in-place # 0.0.33 (25.04.19) Changes: - minor bugfix # 0.0.32 (25.04.19) Changes: - Use gr.to_bed for output_methods, not gr.out.bed - Remove copy_df flag in constructor; using df.copy() is terser - change flag extended in constructor to int64 (default False) # 0.0.31 (24.04.19) Changes: - Make int32 default for Start/End Additions: - PyRanges now has window-function, like bedtools makewindows Fixes: - getitem sometimes returned int32-pyrange despite being given int64-pyrange - doing nearest two times in a row sometimes failed due to minor suffix-bug # 0.0.30 (23.04.19) Changes: - Make col first argument of assign # 0.0.29 (23.04.19) Changes: - Move pyranges db to own module to remove mysql-requirement (made wheelmaking hard) Additions: - add assign and subset methods on pyrange # 0.0.28 (22.04.19) - Only refer to and use ray in dispatcher # 0.0.27 (22.04.19) Fixes: - raise Exception when encountering non-"+-" Strand values # 0.0.26 (15.04.19) Additions: - pr.sandbox.Debug context manager for pipes Fixes: - coverage errored with value_col # 0.0.25 (15.04.19) Additions: - Can set columns on a PyRanges using a dict of iterables - gr() takes subset and col argument, like dplyr mutate and select Removed: - disallow eval string, must use lambdas, e.g.: gr(lambda df: df.Score > 0) Fixes: - drop (and getitem) small fix - sometimes had empty dfs in dict because of unused categoricals # 0.0.24 (15.04.19) Hotfix: - left in dbg statements # 0.0.23 (15.04.19) Hotfix: - unstrand() did not always remove strand info # 0.0.22 (14.04.19) Additions: - pr.PyRanges() returns empty PyRange # before you needed pr.PyRanges({}) - pyranges are now callable. Examples: gr("df.Score > 0") and gr("df.A.astype(str) + mysuffix") - can subset PyRanges with a dict of boolean vectors - pr.data.exons(), pr.data.cpg() - gr.unstrand() removes strand information from a PyRanges - throw exception if trying to drop Strand from df without setting drop_strand=True - adding a Strand column to the PyRanges makes it stranded Changes: - write dtype as category, not int8/int16/... Fixes: - remove empty dfs in the dict given to the PyRanges constructor Removed: - gr.data.epigenome_roadmap() # 0.0.21 (14.04.19) Additions: - gr.cluster(): assign ID to each cluster found by merge - gr.columns: return the columns in the pyranges - gr.drop: drop columns based on regex or list - gr[["Score", "Name"]]: select subset of columns Fixes: - gr.stranded errored if chromosomes were ints