## Release 1.19 (12th December 2023)


Changes affecting the whole of bcftools, or multiple commands:

* Filtering expressions can be given a file with a list of strings to match. This
  was previously possible only for the ID column. For example

    ID=@file            .. selects lines with ID present in the file
    INFO/TAG=@file.txt  .. selects lines where TAG has a string value listed in the file
    INFO/TAG!=@file.txt .. TAG must not have a string value listed in the file

  Allow querying the REF,ALT columns directly, for example

    -e 'REF="N"'
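
The `@file` matching above can be pictured as a set-membership test. The following is a minimal Python sketch, not bcftools code: the helper names `load_values` and `tag_matches` are hypothetical, and it assumes the file lists one string per line and that the tag carries a single string value.

```python
import io

def load_values(handle):
    """Read one string per line from an open text handle, skipping empty lines."""
    return {line.strip() for line in handle if line.strip()}

def tag_matches(value, listed, negate=False):
    """Mimic TAG=@file (negate=False) or TAG!=@file (negate=True)."""
    found = value in listed
    return not found if negate else found

# Hypothetical file content, held in memory for illustration
listed = load_values(io.StringIO("missense\nstop_gained\n"))
keep = tag_matches("missense", listed)            # TAG=@file
drop = tag_matches("missense", listed, True)      # TAG!=@file
```

Loading the strings into a set once keeps each per-record lookup cheap, which is the main reason to prefer a file over a long chain of `||` conditions.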


Changes affecting specific commands:

* bcftools annotate

    - Fix `bcftools annotate --mark-sites`, VCF sites overlapping regions in a BED file
      were not annotated (#1989)

    - Add flexibility to FILTER column transfers and allow transfers within the same file,
      across files, and in combination. For examples see
        http://samtools.github.io/bcftools/howtos/annotate.html#transfer_filter_to_info

* bcftools call

    - Output MIN_DP rather than MinDP in gVCF mode

    - New `-*, --keep-unseen-allele` option to output the unobserved allele <*>,
      intended for gVCF.

* bcftools head

    - New `-s, --samples` option to include the #CHROM header line with samples.

* bcftools gtcheck

    - Add output options `-o, --output` and `-O, --output-type`

    - Add filtering options `-i, --include` and `-e, --exclude`

    - Rename the short option `-e, --error-probability` from lower case to upper
      case `-E, --error-probability`

    - Changes to the output format, replace the DC section with DCv2:

        - adds a new column for the number of matching genotypes

        - The --error-probability is newly interpreted as the probability of erroneous
          allele rather than genotype. In other words, the calculation of the discordance
          score now considers the probability of genotyping error to be different
          for HOM and HET genotypes, i.e. P(0/1|dsg=0) > P(1/1|dsg=0).

        - fixes in HWE score calculation plus output average HWE score rather
          than absolute HWE score

        - better description of fields

* bcftools merge

    - Add `-m` modifiers to suppress the output of the unseen allele <*> or <NON_REF>
      at variant sites (e.g. `-m both,*`) or all sites (e.g. `-m both,**`)

* bcftools mpileup

    - Output MIN_DP rather than MinDP in gVCF mode

* bcftools norm

    - Add the number of joined lines to the summary output, for example

        Lines   total/split/joined/realigned/skipped:	6/0/3/0/0

    - Allow combining -m and -a with --old-rec-tag (#2020)

    - Symbolic <DEL> alleles caused norm to expand REF to the full length of the deletion.
      This was not intended and problematic for long deletions, the REF allele should list
      one base only (#2029)
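
The summary line printed by `bcftools norm` pairs slash-separated field names with slash-separated counts, so it can be turned into a dictionary with a few string operations. This is only a sketch based on the example line shown above; `parse_norm_summary` is a hypothetical helper, and the exact whitespace in real output is an assumption.

```python
def parse_norm_summary(line):
    """Split 'Lines   total/split/...:<TAB>6/0/...' into a dict of counts."""
    label, _, counts = line.partition(":")
    # The field names are the last whitespace-separated token before the colon
    names = label.split()[-1].split("/")
    values = [int(x) for x in counts.split("/")]
    if len(names) != len(values):
        raise ValueError("field names and counts differ in length")
    return dict(zip(names, values))

summary = parse_norm_summary(
    "Lines   total/split/joined/realigned/skipped:\t6/0/3/0/0"
)
```

Reading the names from the line itself, rather than hard-coding them, means the same helper also accepts the older summary format that lacks the `joined` field.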

* bcftools query

    - Add new `-N, --disable-automatic-newline` option to restore the pre-1.18 query formatting
      behavior, in which a missing newline was not added

    - Make the automatic addition of the newline character more predictable: when missing,
      it is now always put at the end of the expression. In version 1.18 it could
      be added at the end of the expression (for per-site expressions) or inside the square
      brackets (for per-sample expressions). The new behavior is:

        - if the formatting expression contains a newline character, do nothing
        - if there is no newline character and -N, --disable-automatic-newline is given, do nothing
        - if there is no newline character and -N is not given, insert newline at the end of the expression

      See #1969 for details

    - Add new `-F, --print-filtered` option to output a default string for samples that would otherwise
      be filtered by `-i/-e` expressions.

    - Include sample name in the output header with `-H` whenever it makes sense (#1992)
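
The three newline rules listed for `bcftools query` above can be written as one small decision function. This is a Python sketch of the described behavior, not the actual implementation; `finalize_query_format` is a hypothetical name, and the newline is assumed to appear in the expression either as the two-character escape `\n` typed on the command line or as a literal newline.

```python
def finalize_query_format(expr, disable_automatic_newline=False):
    """Return the formatting expression with a trailing newline escape if needed."""
    has_newline = "\\n" in expr or "\n" in expr
    if has_newline:
        return expr                      # rule 1: expression already has a newline
    if disable_automatic_newline:
        return expr                      # rule 2: -N given, leave untouched
    return expr + "\\n"                  # rule 3: append at the very end

per_site = finalize_query_format("%CHROM\\t%POS")
per_sample = finalize_query_format("[ %GT]")
```

Note that for a per-sample expression such as `[ %GT]` the escape lands after the closing bracket, which is the difference from 1.18 that the entry describes.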

* bcftools +split-vep

    - Fix on the fly filtering involving numeric subfields, e.g. `-i 'MAX_AF<0.001'` (#2039)

    - Interpret default column type names (--columns-types) as entire strings, rather than
      substrings to avoid unexpected spurious matches (i.e. internally add ^ and $ to all
      field names)

* bcftools +trio-dnm2

    - Do not flag paternal genotyping errors as de novo mutations. Specifically, when father's
      chrX genotype is 0/1 and mother's 0/0, 0/1 in the child will not be marked as DNM.

* bcftools view

    - Add new `-A, --trim-unseen-allele` option to remove the unseen allele <*> or <NON_REF>
      at variant sites (`-A`) or all sites (`-AA`)


## Release 1.18 (25th July 2023)

Changes affecting the whole of bcftools, or multiple commands:

* Support auto indexing during writing BCF and VCF.gz via new `--write-index` option


Changes affecting specific commands:

* bcftools annotate

    - The `-m, --mark-sites` option can be now used to mark all sites without the
      need to provide the `-a` file (#1861)

    - Fix a bug where the `-m` function did not respect the `--min-overlap` option (#1869)

    - Fix a bug where an update of INFO/END resulted in an assertion error (#1957)

* bcftools concat

    - New option `--drop-genotypes`

* bcftools consensus

    - Support higher-ploidy genotypes with `-H, --haplotype` (#1892)

    - Allow `--mark-ins` and `--mark-snv` with a character, similarly to `--mark-del`

* bcftools convert

    - Support for conversion from tab-delimited files (CHROM,POS,REF,ALT) to sites-only VCFs

* bcftools csq

    - New `--unify-chr-names` option to automatically unify different chromosome
      naming conventions in the input GFF, fasta and VCF files (e.g. "chrX" vs "X")

    - More versatility in parsing various flavors of GFF

    - A new `--dump-gff` option to help with debugging and investigating the internals
      of GFF parsing

    - When printing consequences in nonsense mediated decay transcripts, include 'NMD_transcript'
      in the consequence part of the annotation. This is to make filtering easier and analogous to
      VEP annotations. For example the consequence annotation
            3_prime_utr|PCGF3|ENST00000430644|NMD
      is newly printed as
            3_prime_utr&NMD_transcript|PCGF3|ENST00000430644|NMD

* bcftools gtcheck

    - Add stats for the number of sites matched in the GT-vs-GT, GT-vs-PL, etc modes. This
      information is important for interpretation of the discordance score, as only the
      GT-vs-GT matching can be interpreted as the number of mismatching genotypes.

* bcftools +mendelian2

    - Fix in command line argument parsing, the `-p` and `-P` options were not
      functioning (#1906)

* bcftools merge

    - New `-M, --missing-rules` option to control the behavior of merging of vector tags
      to prevent mixtures of known and missing values in tags when desired

    - Use values pertaining to the unknown allele (<*> or <NON_REF>) when available
      to prevent mixtures of known and missing values (#1888)

    - Revamped line matching code to fix problems in gVCF merging where split gVCF blocks
      would not update genotypes (#1891, #1164).

* bcftools mpileup

    - Fix a bug in `--indels-2.0` which caused an endless loop when CIGAR operator 'H' or 'P'
      was encountered

* bcftools norm

    - The `-m, --multiallelics +` mode now preserves phasing (#1893)

    - Symbolic <DEL.*> alleles are now normalized too (#1919)

    - New `-g, --gff-annot` option to right-align indels in forward transcripts to follow
      HGVS 3'rule (#1929)

* bcftools query

    - Force newline character in formatting expression when not given explicitly

    - Fix `-H` header output in formatting expressions containing newlines

* bcftools reheader

    - Make `-f, --fai` aware of long contigs not representable by 32-bit integer (#1959)

* bcftools +split-vep

    - Prevent a segfault when `-i/-e` use a VEP subfield not included in `-f` or `-c` (#1877)

    - New `-X, --keep-sites` option complementing the existing `-x, --drop-sites` option

    - Force newline character in formatting expression when not given explicitly

    - Fix a subtle ambiguity: identical rows must be returned when `-s` is applied regardless
      of `-f` containing the `-a` VEP tag itself or not.

* bcftools stats

    - Collect new VAF (variant allele frequency) statistics from FORMAT/AD field

    - When counting transitions/transversions, consider also alternate het genotypes

* plot-vcfstats

    - Add three new VAF plots


## Release 1.17 (21st February 2023)


Changes affecting the whole of bcftools, or multiple commands:

* The -i/-e filtering expressions

    - Error checks were added to prevent incorrect use of vector arithmetics. For example,
      when evaluating the sum of two vectors A and B, the resulting vector could contain
      nonsense values when the input vectors were not of the same length. The fix introduces
      the following logic:
        - evaluate to C_i = A_i + B_i when length(A)==length(B) and set length(C)=length(A)
        - evaluate to C_i = A_i + B_0 when length(B)==1 and set length(C)=length(A)
        - evaluate to C_i = A_0 + B_i when length(A)==1 and set length(C)=length(B)
        - throw an error when length(A)!=length(B) AND length(A)!=1 AND length(B)!=1

    - Arrays in Number=R tags can be now subscripted by alleles found in FORMAT/GT. For example,

        FORMAT/AD[GT] > 10        .. require support of more than 10 reads for each allele
        FORMAT/AD[0:GT] > 10      .. same as above, but in the first sample
        sSUM(FORMAT/AD[GT]) > 20  .. require total sample depth bigger than 20
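
The vector arithmetic rules introduced for `-i/-e` expressions in this release amount to element-wise addition with scalar broadcasting. A minimal Python sketch of that logic follows; `vector_add` is a hypothetical name and the code only mirrors the four cases listed above, it is not taken from the bcftools source.

```python
def vector_add(a, b):
    """Add two vectors following the length rules listed in the release notes."""
    if len(a) == len(b):
        return [x + y for x, y in zip(a, b)]     # C_i = A_i + B_i
    if len(b) == 1:
        return [x + b[0] for x in a]             # C_i = A_i + B_0
    if len(a) == 1:
        return [a[0] + y for y in b]             # C_i = A_0 + B_i
    raise ValueError(
        f"incompatible vector lengths: {len(a)} and {len(b)}"
    )

same_length = vector_add([1, 2, 3], [10, 20, 30])
broadcast = vector_add([1, 2, 3], [10])
```

Raising an error in the last case, instead of silently truncating or padding, is what prevents the nonsense values the entry mentions.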

* The commands `consensus -H` and `+split-vep -H`

    - Drop unnecessary leading space in the first header column and newly print `#[1]columnName`
      instead of the previous `# [1]columnName` (#1856)


Changes affecting specific commands:

* bcftools +allele-length

    - Fix overflow for indels longer than 512bp and aggregate alleles equal or larger than
      that in the same bin (#1837)

* bcftools annotate

    - Support sample reordering of annotation file (#1785)

    - Restore lost functionality of the --pair-logic option (#1808)

* bcftools call

    - Fix a bug where too many alleles passed to `-C alleles` via `-T` caused memory
      corruption (#1790)

    - Fix a bug where indels constrained with `-C alleles -T` would sometimes be missed (#1706)

* bcftools consensus

    - BREAKING CHANGE: the option `-I, --iupac-codes` newly outputs IUPAC codes based on FORMAT/GT
      of all samples. The `-s, --samples` and `-S, --samples-file` options can be used to subset
      samples. In order to ignore samples and consider only the REF and ALT columns (the original
      behavior prior to 1.17), run with `-s -` (#1828)

* bcftools convert

    - Make variantkey conversion work for sites without an ALT allele (#1806)

* bcftools csq

    - Fix a bug where a MNV with multiple consequences (e.g. missense + stop_gained)
      would report only the less severe one (#1810)

    - GFF file parsing was made slightly more flexible, ids can now be just 'XXX'
      rather than, for example, 'gene:XXX'

    - New gff2gff perl script to fix GFF formatting differences

* bcftools +fill-tags

    - More of the available annotations are now added by the `-t all` option

* bcftools +fixref

    - New INFO/FIXREF annotation

    - New -m swap mode

* bcftools +mendelian

    - The +mendelian plugin has been deprecated and replaced with +mendelian2. The
      function of the plugin is the same but the command line options and the output
      format have changed, and for this reason it was introduced as a new plugin.

* bcftools mpileup

    - Most of the annotations generated by mpileup are now optional and can be selected via
      the `-a, --annotate` option. Several new (mostly experimental) annotations were added.

    - New option `--indels-2.0` for an EXPERIMENTAL indel calling model. This model aims
      to address some known deficiencies of the current indel calling algorithm, specifically,
      it uses diploid reference consensus sequence. Note that in the current version it
      has the potential to increase sensitivity but at the cost of decreased specificity.

    - Make the FS annotation (Fisher exact test strand bias) functional and remove it
      from the default annotations

* bcftools norm

    - New --multi-overlaps option allows one to set overlapping alleles either to the
      ref allele (the current default) or to a missing allele (#1764 and #1802)

    - Fixed a bug in `-m -` which did not split missing FORMAT values correctly and
      could lead to empty FORMAT fields such as `::` instead of the correct `:.:` (#1818)

    - The `--atomize` option previously would not split complex indels such as C>GGG.
      Newly these will be split into two records C>G and C>CGG (#1832)

* bcftools query

    - Fix a rare bug where the printing of SAMPLE field with `query` was incorrectly
      suppressed when the `-e` option contained a sample expression while the formatting
      query did not. See #1783 for details.

* bcftools +setGT

    - Add new `--new-gt X` option (#1800)

    - Add new `--target-gt r:FLOAT` option to randomly select a proportion of genotypes (#1850)

    - Fix a bug where `-t ./x` mode was advertised as selecting both phased and unphased
      half-missing genotypes, but was in fact selecting only unphased genotypes (#1844)

* bcftools +split-vep

    - New options `-g, --gene-list` and `--gene-list-fields` which allow one to prioritize
      consequences from a list of genes, or restrict output to the listed genes

    - New `-H, --print-header` option to print the header with `-f`

    - Work around a bug in the LOFTEE VEP plugin used to annotate gnomAD VCFs. There the
      LoF_info subfield contains commas which, in general, makes it impossible to parse the
      VEP subfields. The +split-vep plugin can now work with such files, replacing the offending
      commas with slash (/) characters. See also https://github.com/Ensembl/ensembl-vep/issues/1351

    - Newly the `-c, --columns` option can be omitted when a subfield is used in `-i/-e` filtering
      expression. Note that `-c` may still have to be given when it is not possible to infer the
      type of the subfield. Note that this is an experimental feature.

* bcftools stats

    - The per-sample stats (PSC) would not be computed when `-i/-e` filtering options and
      the `-s -` option were given but the expression did not include sample columns (#1835)

* bcftools +tag2tag

    - Revamp of the plugin to allow wider range of tag conversions, specifically all combinations
      from FORMAT/GL,PL,GP to FORMAT/GL,PL,GP,GT

* bcftools +trio-dnm2

    - New `-n, --strictly-novel` option to downplay alleles which violate Mendelian
      inheritance but are not novel

    - Allow setting the `--pn` and `--pns` options separately for SNVs and indels and make
      the indel settings more strict by default

    - Output missing FORMAT/VAF values in non-trio samples, rather than random nonsense values

* bcftools +variant-distance

    - New option `-d, --direction` to choose the directionality: forward, reverse, nearest (the default)
      or both (#1829)


## Release 1.16 (18th August 2022)

* New plugin `bcftools +variant-distance` to annotate records with distance to the
  nearest variant (#1690)


Changes affecting the whole of bcftools, or multiple commands:

* The -i/-e filtering expressions

    - Added support for querying of multiple filters, for example `-i 'FILTER="A;B"'`
      can be used to select sites with two filters "A" and "B" set. See the documentation
      for more examples.

    - Added modulo arithmetic operator

Changes affecting specific commands:

* bcftools annotate

    - A bug introduced in 1.14 caused records with an INFO/END annotation to
      incorrectly trigger the `-c ~INFO/END` mode of comparison even when not explicitly
      requested, which would result in not transferring the annotation from a tab-delimited
      file (#1733)

* bcftools merge

    - New `-m snp-ins-del` switch to merge SNVs, insertions and deletions separately (#1704)

* bcftools mpileup

    - New NMBZ annotation for Mann-Whitney U-z test on number of mismatches within
      supporting reads

    - Suppress the output of MQSBZ and FS annotations in absence of alternate allele

* bcftools +scatter

    - Fix erroneous addition of duplicate PG lines

* bcftools +setGT

    - Custom genotypes (e.g. `-n c:1/1`) now correctly override ploidy

## Release 1.15.1 (7th April 2022)


* bcftools annotate

    - New `-H, --header-line` convenience option to pass a header line on command line,
      this complements the existing `-h, --header-lines` option which requires a file
      with header lines

* bcftools csq

    - A list of consequence types supported by `bcftools csq` has been added to
      the manual page. (#1671)

* bcftools +fill-tags

    - Extend generalized functions so that FORMAT tags can be filled as well, for example:

        bcftools +fill-tags in.bcf -o out.bcf -- -t 'FORMAT/DP:1=int(smpl_sum(FORMAT/AD))'

    - Allow multiple custom functions in a single run. Previously the program would silently
      go with the last one, assigning the same values to all (#1684)

* bcftools norm

    - Fix an assertion failure triggered when a faulty VCF file with a '-'
      character in the REF allele was used with `bcftools norm --atomize`.  This
      option now checks that the REF allele only includes the allowed characters
      A, C, G, T and N. (#1668)

    - Fix the loss of phasing in half-missing genotypes in variant atomization (#1689)

* bcftools roh

    - Fix a bug that could result in an endless loop or incorrect AF estimate when
      missing genotypes are present and the `--estimate-AF -` option was used (#1687)

* bcftools +split-vep

    - VEP fields with characters disallowed in VCF tag names by the specification (such as '-'
      in 'M-CAP') couldn't be queried. This has been fixed, the program now sanitizes the field
      names, replacing invalid characters with underscore (#1686)
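
The field-name sanitization described above can be approximated with a single regular-expression substitution. This Python sketch is an illustration only: `sanitize_field_name` is a hypothetical helper, and the set of allowed characters (letters, digits, underscore and dot) is an assumption based on the key pattern in the VCF specification, so the exact rule applied by the plugin may differ.

```python
import re

# Assumed set of characters permitted in a VCF tag name
_INVALID = re.compile(r"[^0-9A-Za-z_.]")

def sanitize_field_name(name):
    """Replace every character outside the assumed allowed set with '_'."""
    return _INVALID.sub("_", name)

queryable = sanitize_field_name("M-CAP")
```

With this mapping the VEP subfield `M-CAP` would be referred to as `M_CAP` in `-i/-e` and `-f` expressions.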


## Release 1.15 (21st February 2022)

* New `bcftools head` subcommand for conveniently displaying the headers
  of a VCF or BCF file. Without any options, this is equivalent to
  `bcftools view --header-only --no-version` but more succinct and memorable.

* The `-T, --targets-file` option had the following bug originating in HTSlib code:
  when an uncompressed file with multiple columns CHR,POS,REF was provided, the
  REF would be interpreted as 0 gigabases (#1598)

Changes affecting specific commands:

* bcftools annotate

    - In addition to `--rename-annots`, which requires a file with name mappings,
      it is now possible to do the same on the command line `-c NEW_TAG:=OLD_TAG`

    - Add new option --min-overlap which allows one to specify the minimum required
      overlap of intersecting regions

    - Allow transferring ALT from a VCF with or without replacement using
        bcftools annotate -a annots.vcf.gz -c ALT file.vcf.gz
        bcftools annotate -a annots.vcf.gz -c +ALT file.vcf.gz

* bcftools convert

    - Revamp of `--gensample`, `--hapsample` and `--haplegendsample` family of options
      which includes the following changes:

    - New `--3N6` option to output/input the new version of the .gen file format,
      see https://www.cog-genomics.org/plink/2.0/formats#gen

    - Deprecate the `--chrom` option in favor of `--3N6`. A simple `cut` command
      can be used to convert from the new 3*M+6 column format to the format printed
      with `--chrom` (`cut -d' ' -f1,3-`).

    - The CHROM:POS_REF_ALT IDs which are used to detect strand swaps are required
      and must appear either in the "SNP ID" column or the "rsID" column. The column
      is autodetected for `--gensample2vcf`, can be the first or the second for
      `--hapsample2vcf` (depending on whether the `--vcf-ids` option is given), must be
      the first for `--haplegendsample2vcf`.
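
For readers who post-process the `.gen` output in a script rather than in the shell, the `cut -d' ' -f1,3-` command mentioned above (keep field 1 and fields 3 onward) has a direct Python equivalent. The helper name `drop_second_column` is hypothetical, and the sample line uses placeholder column values rather than real data.

```python
def drop_second_column(line, sep=" "):
    """Equivalent of `cut -d' ' -f1,3-` for a single line of text."""
    fields = line.rstrip("\n").split(sep)
    return sep.join(fields[:1] + fields[2:])

# Placeholder columns c1..c6 stand in for the real .gen fields
converted = drop_second_column("c1 c2 c3 c4 c5 c6\n")
```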

* bcftools csq

    - Allow GFF files with phase column unset

* bcftools filter

    - New `--mask`, `--mask-file` and `--mask-overlap` options to soft filter
      variants in regions (#1635)

* bcftools +fixref

    - The `-m id` option now works also for non-dbSNP ids, i.e. not just `rsINT`

    - New `-m flip-all` mode for flipping all sites, including ambiguous A/T and C/G sites

* bcftools isec

    - Prevent segfault on sites filtered with -i/-e in all files (#1632)

* bcftools mpileup

    - More flexible read filtering using the options
        --ls, --skip-all-set    ..  skip reads with all of the FLAG bits set
        --ns, --skip-any-set    ..  skip reads with any of the FLAG bits set
        --lu, --skip-all-unset  ..  skip reads with all of the FLAG bits unset
        --nu, --skip-any-unset  ..  skip reads with any of the FLAG bits unset

      The existing synonymous options will continue to function but their use
      is discouraged
        --rf, --incl-flags STR|INT  Required flags: skip reads with mask bits unset
        --ff, --excl-flags STR|INT  Filter flags: skip reads with mask bits set
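
The four read-filtering modes listed above differ only in how the read's FLAG is compared with the given bit mask. The Python sketch below spells out those comparisons as described in the option summaries; `should_skip` and the mode strings are hypothetical names for illustration, and treating a zero mask as "filter disabled" is an assumption.

```python
def should_skip(flag, mask, mode):
    """Decide whether a read with the given FLAG is skipped under one mode."""
    if mask == 0:
        return False                 # assumed: an empty mask disables the filter
    hit = flag & mask                # bits of the mask that are set in the FLAG
    if mode == "skip-all-set":
        return hit == mask           # every mask bit is set
    if mode == "skip-any-set":
        return hit != 0              # at least one mask bit is set
    if mode == "skip-all-unset":
        return hit == 0              # none of the mask bits is set
    if mode == "skip-any-unset":
        return hit != mask           # at least one mask bit is not set
    raise ValueError(f"unknown mode: {mode}")

# FLAG 0x401 has bits 0x1 and 0x400 set; the mask 0x500 asks about 0x100 and 0x400
decisions = {
    mode: should_skip(0x401, 0x500, mode)
    for mode in ("skip-all-set", "skip-any-set", "skip-all-unset", "skip-any-unset")
}
```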

* bcftools query

    - Make the `--samples` and `--samples-file` options work also in the `--list-samples`
      mode. Add a new `--force-samples` option which allows one to proceed even when some of
      the requested samples are not present in the VCF (#1631)

* bcftools +setGT

    - Fix a bug in `-t q -e EXPR` logic applied on FORMAT fields, sites with all
      samples failing the expression EXPR were incorrectly skipped. This problem
      affected only the use of `-e` logic, not the `-i` expressions (#1607)

* bcftools sort

    - make use of the TMPDIR environment variable when defined

* bcftools +trio-dnm2

    - The --use-NAIVE mode now also adds the de novo allele in FORMAT/VA


## Release 1.14 (22nd October 2021)


Changes affecting the whole of bcftools, or multiple commands:

* New `--regions-overlap` and `--targets-overlap` options which address
  a long-standing design problem with subsetting VCF files by region.
  BCFtools recognizes two sets of options, one for streaming (`-t/-T`) and
  one for index-jumping (`-r/-R`). They behave differently: the first
  includes only records with POS coordinate within the regions, the other
  includes overlapping records. The two new options allow one to modify the
  default behavior, see the man page for more details.
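
The difference between the two default behaviors can be shown with two small predicates. This is a Python sketch under stated assumptions: coordinates are 1-based and inclusive, the record's extent is taken from the REF length, and the function names `pos_in_region` and `record_overlaps_region` are hypothetical.

```python
def pos_in_region(pos, beg, end):
    """Streaming-style test: only the POS coordinate must fall in the region."""
    return beg <= pos <= end

def record_overlaps_region(pos, ref_len, beg, end):
    """Index-style test: any base of the record may overlap the region."""
    rec_end = pos + ref_len - 1
    return pos <= end and rec_end >= beg

# A 5-base deletion starting at 98 spans 98-102; the region is 100-200
by_pos = pos_in_region(98, 100, 200)
by_overlap = record_overlaps_region(98, 5, 100, 200)
```

The deletion in the example starts before the region but extends into it, so it is excluded by the first test and included by the second, which is the kind of discrepancy the new options are meant to resolve.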

* The `--output-type` option can be used to override the default compression
  level

Changes affecting specific commands:

* bcftools annotate

    - when `--set-id` and `--remove` are combined, `--set-id` cannot use
      tags deleted by `--remove`. This is now detected and the program
      exits with an informative error message instead of segfaulting
      (#1540)

    - while non-symbolic variants are uniquely identified by POS,REF,ALT,
      symbolic alleles starting at the same position were indistinguishable.
      This prevented correct matching of records with the same positions and
      variant type but different length given by INFO/END (samtools/htslib@60977f2).
      When annotating from a VCF/BCF, the matching is done automatically. When
      annotating from a tab-delimited text file, this feature can be invoked
      by using `-c INFO/END`.

    - add a new '.' modifier to control whether missing values should be carried
      over from a tab-delimited file or not. For example:

        -c TAG  .. adds TAG if the source value is not missing. If TAG
                   exists in the target file, it will be overwritten

        -c .TAG .. adds TAG even if the source value is missing. This
                   can overwrite non-missing values with a missing value
                   and can create empty VCF fields (`TAG=.`)

* bcftools +check-ploidy

    - by default missing genotypes are not used when determining ploidy.
      With the new option `-m, --use-missing` it is possible to use the
      information carried in the missing and half-missing genotypes
      (e.g. ".", "./." or "./1")

* bcftools concat:

    - new `--ligate-force` and `--ligate-warn` options for finer control
      of `-l, --ligate` behavior in imperfect overlaps. The new default is
      to throw an error when sites present in one chunk but absent in the
      other are encountered. To drop such sites and proceed, use the new
      `--ligate-warn` option (previously this was the default). To keep such
      sites, use the new `--ligate-force` option (#1567).
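
The three policies for imperfect overlaps can be summarized with the following Python sketch. It is a simplified model under stated assumptions: sites are represented by bare positions, the function name `resolve_overlap` and its `mode` argument are invented for the example, and the phasing logic of `--ligate` is not modeled at all.

```python
# Hypothetical model of the three policies; not the BCFtools implementation.
import sys

def resolve_overlap(sites_a, sites_b, mode="error"):
    """Return the sites to keep from the overlap of two chunks.

    sites_a, sites_b: positions seen in the overlapping part of each chunk.
    mode: "error" (new default), "warn" (--ligate-warn) or "force" (--ligate-force).
    """
    a, b = set(sites_a), set(sites_b)
    unmatched = a ^ b                      # present in one chunk only
    if not unmatched:
        return sorted(a)
    if mode == "error":
        raise ValueError(f"sites present in one chunk only: {sorted(unmatched)}")
    if mode == "warn":
        print(f"warning: dropping {sorted(unmatched)}", file=sys.stderr)
        return sorted(a & b)               # drop unmatched sites and proceed
    if mode == "force":
        return sorted(a | b)               # keep unmatched sites
    raise ValueError(f"unknown mode: {mode}")

print(resolve_overlap([100, 150, 200], [100, 200], mode="warn"))   # [100, 200]
print(resolve_overlap([100, 150, 200], [100, 200], mode="force"))  # [100, 150, 200]
```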

* bcftools consensus:

    - Apply mask even when the VCF has no notion about the chromosome. It
      was possible to encounter this problem when `contig` lines were not
      present in the VCF header and no variants were called on that chromosome
      (#1592)

* bcftools +contrast:

    - support for chunking within a map/reduce framework, which allows NASSOC
      counts to be collected even for empty case/control sample sets (#1566)

* bcftools csq:

    - bug fix, compound indels were not recognised in some cases (#1536)

    - compound variants were incorrectly marked as 'inframe' even when
      a stop codon would occur before the frame was restored (#1551)

    - bug fix, FORMAT/BCSQ bitmasks could have been assigned incorrectly
      to some samples at multiallelic sites, a superset of the correct
      consequences would have been set (#1539)

    - bug fix, the upstream stop could be falsely assigned to all samples in
      a multi-sample VCF even if the stop was relevant for a single sample
      only (#1578)

    - further improve the detection of mismatching chromosome naming
      (e.g. "chrX" vs "X") in the GFF, VCF and fasta files

* bcftools merge:

    - keep (sum) INFO/AN,AC values when merging VCFs with no samples (#1394)

* bcftools mpileup:

    - new --indel-size option which allows one to increase the maximum
      indel size considered, large deletions in long read data are otherwise
      lost.

* bcftools norm:

    - atomization now supports Number=A,R string annotations (#1503)

    - assign as many alternate alleles to genotypes at multiallelic sites
      in the `-m +` mode, disregarding the phase.  Previously the program
      assumed it was being run as the inverse operation of `-m -`, but when
      that was not the case, reference alleles would have been filled
      instead of multiple alternate alleles (#1542)

* bcftools sort:

    - increase accuracy of the --max-mem option limit, previously the limit
      could be exceeded by more than 20% (#1576)

* bcftools +trio-dnm:

    - new `--with-pAD` option to allow processing of VCFs without FORMAT/QS.
      The existing `--ppl` option was changed to the analogous `--with-pPL`

* bcftools view:

    - the functionality of the option --compression-level lost in 1.12
      has been restored


## Release 1.13 (7th July 2021)


This release brings new options and significant changes in BAQ parametrization
in `bcftools mpileup`. The previous behavior can be triggered by providing
the `--config 1.12` option. Please see https://github.com/samtools/bcftools/pull/1474
for details.


Changes affecting the whole of bcftools, or multiple commands:

* Improved build system


Changes affecting specific commands:

* bcftools annotate:

    - Fix a rare bug when INFO/END is present, all INFO fields are removed
      with `bcftools annotate -x INFO` and BCF output is produced. Then the
      removed INFO/END continues to inform the end coordinate and causes
      incorrect retrieval of records with the -r option (#1483)

    - Support for matching annotation line by ID, in addition to CHROM,POS,REF,
      and ALT (#1461)

        bcftools annotate -a annots.tab.gz  -c CHROM,POS,~ID,REF,ALT,INFO/END input.vcf

* bcftools csq:

    - When GFF and VCF/fasta use a different chromosome naming convention
      (e.g. chrX vs X), no consequences would be added. Newly the program
      attempts to detect these differences and remove/add the "chr" prefix
      to chromosome name to match the GFF and VCF/fasta (#1507)

    - Parametrize the brief-predictions option to allow an explicit number of
      amino acids to be printed. Note that the `-b, --brief-predictions` option
      is being replaced with `-B, --trim-protein-seq INT`
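
The idea of reconciling "chrX" with "X" can be illustrated with a short Python sketch. It shows the general technique only and makes no claim about how `bcftools csq` implements the detection; the function name `match_chrom_name` and the set of known names are assumptions for the example.

```python
# Illustrative sketch of "chr" prefix reconciliation; not the csq implementation.

def match_chrom_name(name, known_names):
    """Return the spelling of `name` used in known_names, or None if absent.

    Tries the name as given, then with the "chr" prefix removed or added.
    """
    if name in known_names:
        return name
    alt = name[3:] if name.startswith("chr") else "chr" + name
    return alt if alt in known_names else None

gff_chroms = {"1", "2", "X"}              # names used in a hypothetical GFF
print(match_chrom_name("chrX", gff_chroms))   # X
print(match_chrom_name("2", gff_chroms))      # 2
print(match_chrom_name("chrM", gff_chroms))   # None
```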

* bcftools +fill-tags:

    - Generalization and better support for custom functions that allow
      adding new INFO tags based on arbitrary `-i, --include` type of
      expressions. For example, to calculate a missing INFO/DP annotation
      from FORMAT/AD, it is possible to use:

        -t 'DP:1=int(sum(FORMAT/AD))'

      Here the optional ":1" part specifies that a single value will be
      added (by default Number=. is used) and the optional int(...) adds
      an integer value (by default Type=Float is used).

    - When FORMAT/GT is not present, the INFO/AF tag will be newly calculated
      from INFO/AC and INFO/AN.
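
The arithmetic behind the two `+fill-tags` items above can be worked through in Python. The sketch assumes that `sum(FORMAT/AD)` runs over all samples and all alleles, which is what a total depth implies, and that AF is the per-allele ratio AC/AN; missing values and AN=0 are not handled, and the function names are invented for the illustration.

```python
# Worked arithmetic only; function names and data layout are assumptions.

def info_dp_from_ad(format_ad):
    """Total depth as in 'DP:1=int(sum(FORMAT/AD))': one integer per site."""
    return int(sum(sum(sample_ad) for sample_ad in format_ad))

def info_af(ac, an):
    """Allele frequency per ALT allele from INFO/AC and INFO/AN."""
    return [count / an for count in ac]

# Three samples, each with REF,ALT read counts
ad = [[10, 5], [8, 0], [3, 7]]
print(info_dp_from_ad(ad))   # 33

# Two ALT alleles with AC=2,1 among AN=6 called alleles
print(info_af([2, 1], 6))
```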

* bcftools gtcheck:

    - Switch between FORMAT/GT or FORMAT/PL when one is (implicitly) requested
      but only the other is available

    - Improve diagnostics, printing warnings when a line cannot be matched and
      the number of lines skipped for various reasons (#1444)

    - Minor bug fix, with PLs being the default, the `--distinctive-sites` option
      started to require explicit `--error-probability 0`

* bcftools index:

    - The program now accepts both data file name and the index file name. This
      adds to user convenience when running index statistics (-n, -s)

* bcftools isec:

    - Always generate sites.txt with isec -p (#1462)

* bcftools +mendelian:

    - Consider only complete trios, do not crash on sample name typos (#1520)

* bcftools mpileup:

    - New `--seed` option for reproducibility of subsampling code in HTSlib

    - The SCR annotation which shows the number of soft-clipped reads now
      correctly pools reads together regardless of the variant type. Previously
      only reads with indels were included at indel sites.

    - Major revamp of BAQ. Please see https://github.com/samtools/bcftools/pull/1474
      for details. The previous behavior can be triggered by providing the `--config 1.12`
      option.

    - Thanks to improvements in HTSlib, the removal of overlapping reads (which can
      be disabled with the `-x, --ignore-overlaps` option) is not systematically biased
      anymore (https://github.com/samtools/htslib/pull/1273)

    - Modified scale of Mann-Whitney U tests. Newly INFO/*Z annotations will be printed,
      for example MQBZ replaces MQB.

* bcftools norm:

    - Fix Type=Flag output in `norm --atomize` (#1472)

    - Atomization must not discard ALT=. records

    - Atomization of AD and QS tags now correctly updates occurrences of duplicate
      alleles within different haplotypes

    - Fix a bug in atomization of Number=A,R tags

* bcftools reheader:

    - Add `-T, --temp-prefix` option

* bcftools +setGT:

    - A wider range of genotypes can be set by the plugin by allowing
      specifying custom genotypes. For example, to force a heterozygous
      genotype it is now possible to use expressions like:

        c:'m|M'
        c:0/1
        c:0

* bcftools +split-vep:

    - New `-u, --allow-undef-tags` option

    - Better handling of ambiguous keys such as INFO/AF and CSQ/AD. The
      `-p, --annot-prefix` option is now applied before doing anything else
      which allows its use with `-f, --format` and `-c, --columns` options.

    - Some consequence field names may not constitute a valid tag name, such
      as "pos(1-based)". Newly field names are trimmed to exclude brackets.

* bcftools +tag2tag:

    - New --QR-QA-to-QS option to convert annotations generated by FreeBayes
      to QS used by BCFtools

* bcftools +trio-dnm:

    - Add support for sites with more than four alleles. Note that only the
      four most frequent alleles are considered, the model remains unchanged.
      Previously such sites were skipped.

    - New --use-NAIVE option for a naive DNM calling based solely on FORMAT/GT
      and expected Mendelian inheritance. This option is suitable for prefiltering.

    - Fix behavior to match the documentation, the `--dnm-tag DNG` option now
      correctly outputs log scaled values by default, not phred scaled.

    - Fix bug in VAF calculation, homozygous de novo variants were incorrectly
      reported as having VAF=50%

    - Fix arithmetic underflow which could lead to imprecise scores and improve
      sensitivity in high coverage regions

    - Allow combining --pn and --pns to set the noise thresholds independently


## Release 1.12 (17th March 2021)

Changes affecting the whole of bcftools, or multiple commands:

* The output file type is determined from the output file name suffix, where
  available, so the -O/--output-type option is often no longer necessary.

* Make F_MISSING in filtering expressions work for sites with multiple
  ALT alleles (#1343)

* Fix N_PASS and F_PASS to behave according to expectation when reverse
  logic is used (#1397). This fix has the side effect of `query` (or
  programs like `+trio-stats`) behaving differently with these expressions,
  operating now in site-oriented rather than sample-oriented mode. For
  example, the new behavior could be:
    bcftools query -f'[%POS %SAMPLE %GT\n]' -i'N_PASS(GT="alt")==1'
        11	A	0/0
        11	B	0/0
        11	C	1/1
  while previously the same expression would return:
        11	C	1/1
  The original mode can be mimicked by splitting the filtering into two steps:
    bcftools view -i'N_PASS(GT="alt")==1' | \
    bcftools query -f'[%POS %SAMPLE %GT\n]' -i'GT="alt"'
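
The difference between the site-oriented and the two-step mode can be reproduced with a small Python model of the example above. The data structure and function names are assumptions, and `is_alt` is only a simplified stand-in for the `GT="alt"` expression.

```python
# Simplified model of the example above; not BCFtools code.
site = {"POS": 11, "GT": {"A": "0/0", "B": "0/0", "C": "1/1"}}

def is_alt(gt):
    """Stand-in for GT="alt": at least one non-reference, non-missing allele."""
    return any(a not in ("0", ".") for a in gt.replace("|", "/").split("/"))

def n_pass(site):
    """Number of samples whose genotype passes is_alt."""
    return sum(is_alt(gt) for gt in site["GT"].values())

def query_site_oriented(site):
    """New behavior: the site passes as a whole, so every sample is printed."""
    if n_pass(site) == 1:
        return [(site["POS"], s, gt) for s, gt in site["GT"].items()]
    return []

def query_two_step(site):
    """Original mode: select the site first, then print matching samples only."""
    return [row for row in query_site_oriented(site) if is_alt(row[2])]

print(query_site_oriented(site))  # all three samples
print(query_two_step(site))       # [(11, 'C', '1/1')]
```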

Changes affecting specific commands:

* bcftools annotate:

    - New `--rename-annots` option to help fix broken VCFs (#1335)

    - New -C option allows one to read a long list of options from a file to
      prevent very long command lines.

    - New `append-missing` logic allows annotations to be added for each ALT
      allele in the same order as they appear in the VCF. Note that this is
      not bulletproof. In order for this to work:

        - the annotation file must have one line per ALT allele

        - fields must contain a single value as multiple values are appended
          as they are and would break the correspondence between the alleles
          and values

* bcftools concat:

    - Do not phase genotypes by mistake if they are not already phased
      with `-l` (#1346)

* bcftools consensus:

    - New `--mask-with`, `--mark-del`, `--mark-ins`, `--mark-snv` options
      (#1382, #1381, #1170)

    - Symbolic <DEL> should have only one REF base. If there are multiple,
      take POS+1 as the first deleted base.

    - Make consensus work when the first base of the reference genome is
      deleted. In this situation the VCF record has POS=1 and the first
      REF base cannot precede the event. (#1330)

* bcftools +contrast:

    - The NOVELGT annotation was previously not added when requested.

* bcftools convert:

    - Make the --hapsample and --hapsample2vcf options consistent with each
      other and with the documentation.

* bcftools call:

    - Revamp of `call -G`, previously sample grouping by population was not
      truly independent and could still be influenced by the presence of other
      sample groups.

    - Optional addition of INFO/PV4 annotation with `call -a INFO/PV4`

    - Remove generation of useless HOB and ICB annotation;
      use `+fill-tags -- -t HWE,ExcHet` instead

    - The `call -f` option was renamed to `-a` to (1) make it consistent with
      `mpileup` and (2) to indicate that it includes both INFO and FORMAT
      annotations, not just FORMAT as previously

    - Any sensible Number=R,Type=Integer annotation can be used with -G,
      such as AD or QS

    - Don't trim QUAL; although usefulness of this change is questionable for
      true probabilistic interpretation (such high precision is unrealistic),
      using QUAL as a score rather than probability is helpful and permits more
      fine-grained filtering

    - Fix a suspected bug in `call -F`; at the very least, the change improves
      readability

    - `call -C trio` is temporarily disabled

* bcftools csq:

    - Fix a bug which caused incorrect FORMAT/BCSQ formatting at sites with too
      many per-sample consequences

    - Fix a bug which incorrectly handled the --ncsq parameter and could clash
      with reserved BCF values, consequently producing truncated or even incorrect
      output of the %TBCSQ formatting expression in `bcftools query`. To account
      for the reserved values, the new default value is --ncsq 15 (#1428)

* bcftools +fill-tags:

    - MAF definition revised for multiallelic sites, the second most common
      allele is considered to be the minor allele (#1313)

    - New FORMAT/VAF, VAF1 annotations to set the fraction of alternate reads
      provided FORMAT/AD is present
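
The revised MAF definition can be illustrated with a few lines of Python. This is a sketch of the stated rule (the frequency of the second most common allele), with allele counts given as a plain list for REF and each ALT; the function name and the handling of degenerate input are assumptions.

```python
# Sketch of the stated rule; not the +fill-tags implementation.

def maf(allele_counts):
    """Frequency of the second most common allele, or None if undefined.

    allele_counts: observed counts for REF followed by each ALT allele.
    """
    total = sum(allele_counts)
    if total == 0 or len(allele_counts) < 2:
        return None
    second = sorted(allele_counts, reverse=True)[1]
    return second / total

print(maf([70, 20, 10]))  # 0.2, the second most common allele has 20 of 100
print(maf([30, 70]))      # 0.3, at a biallelic site REF can be the minor allele
```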

* bcftools gtcheck:

    - support matching of a single sample against all other samples in the file
      with `-s qry:sample -s gt:-`. This was previously not possible, either
      full cross-check mode had to be run or a list of pairs/samples had to
      be created explicitly

* bcftools merge:

    - Make `merge -R` behavior consistent with other commands and pull in
      overlapping records with POS outside of the regions (#1374)

    - Bug fix (#1353)

* bcftools mpileup:

    - Add new optional tag `mpileup -a FORMAT/QS`

* bcftools norm:

    - New `-a, --atomize` functionality to decompose complex variants,
      for example MNVs into consecutive SNVs

    - New option `--old-rec-tag` to indicate the original variant
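
Decomposing an MNV into consecutive SNVs can be sketched as follows. The example covers only equal-length substitutions and is not the algorithm used by `norm --atomize`, which also handles other complex variants; the function name and tuple layout are assumptions.

```python
# Minimal illustration of MNV decomposition; not the bcftools norm algorithm.

def atomize_mnv(pos, ref, alt):
    """Split an equal-length REF/ALT pair into per-base SNVs.

    Returns a list of (POS, REF, ALT) tuples, skipping positions where
    the two alleles carry the same base.
    """
    if len(ref) != len(alt):
        raise ValueError("this sketch handles only equal-length substitutions")
    return [(pos + i, r, a) for i, (r, a) in enumerate(zip(ref, alt)) if r != a]

print(atomize_mnv(100, "AC", "GT"))    # [(100, 'A', 'G'), (101, 'C', 'T')]
print(atomize_mnv(100, "ACG", "TCA"))  # [(100, 'A', 'T'), (102, 'G', 'A')]
```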

* bcftools query:

    - Incorrect fields were printed in the per-sample output when subset
      of samples was requested via -s/-S and the order of samples in the
      header was different from the requested -s/-S order (#1435)

* bcftools +prune:

    - New options --random-seed and --nsites-per-win-mode (#1050)

* bcftools +split-vep:

    - Transcript selection now works also on the raw CSQ/BCSQ annotation.

    - Bug fix, samples were dropped on VCF input and VCF/BCF output (#1349)

* bcftools stats:

    - Changes to QUAL and ts/tv plotting stats: avoid capping QUAL to
      predefined bins, use an open-range logarithmic binning instead

    - plot dual ts/tv stats: per quality bin and cumulative as if threshold
      applied on the whole dataset

* bcftools +trio-dnm2:

    - Major revamp of +trio-dnm plugin, which is now deprecated and replaced by
      +trio-dnm2.

      The original trio-dnm calling model used genotype likelihoods (PLs) as the
      input for calling. However, that is flawed because PLs make assumptions
      which are unsuitable for de novo calling: PL(RR) can become bigger than
      PL(RA) even when the ALT allele is present in the parents. Note that
      this is true also for other programs such as DeNovoGear which rely on
      the same samtools calculation.

      The new recommended workflow is

        bcftools mpileup -a AD,QS -f ref.fa -Ou proband.bam father.bam mother.bam |
           bcftools call -mv -Ou |
           bcftools +trio-dnm -p proband,father,mother -Oz -o output.vcf.gz

      This new version also implements the DeNovoGear model. The original
      behavior of trio-dnm is no longer supported.

      For more details see http://samtools.github.io/bcftools/trio-dnm.pdf


## Release 1.11 (22nd September 2020)


Changes affecting the whole of bcftools, or multiple commands:

* Filtering -i/-e expressions

    - Breaking change in -i/-e expressions on the FILTER column.  Originally
      it was possible to query only a subset of filters, but not an exact match.
      The new behavior is:

        FILTER="A"          .. exact match, for example "A;B" does not pass
        FILTER!="A"         .. exact match, for example "A;B" does pass
        FILTER~"A"          .. both "A" and "A;B" pass
        FILTER!~"A"         .. neither "A" nor "A;B" pass

    - Fix in commutative comparison operators, in some cases reversing sides
      would produce incorrect results (#1224; #1266)

    - Better support for filtering on sample subsets

    - Add SMPL_*/S* family of functions that evaluate within rather than across
      all samples. (#1180)
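
The four FILTER operators listed above can be modeled with Python sets. The sketch reproduces the single-filter cases from the table; treating a multi-value query as a subset test for `~` is an assumption, and the function names are invented for the illustration.

```python
# Set-based model of the FILTER operators; not the BCFtools expression parser.

def parse(filter_string):
    """Split a FILTER column value such as "A;B" into a set of names."""
    return set(filter_string.split(";"))

def filter_eq(record, query):        # FILTER="A"
    return parse(record) == parse(query)

def filter_ne(record, query):        # FILTER!="A"
    return not filter_eq(record, query)

def filter_like(record, query):      # FILTER~"A"
    return parse(query) <= parse(record)

def filter_not_like(record, query):  # FILTER!~"A"
    return not filter_like(record, query)

for record in ("A", "A;B"):
    print(record,
          filter_eq(record, "A"), filter_ne(record, "A"),
          filter_like(record, "A"), filter_not_like(record, "A"))
```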

* Improvements in the build system


Changes affecting specific commands:

* bcftools annotate:

    - Previously it was not possible to use `--columns =TAG` with INFO tags
      and the `--merge-logic` feature was restricted to tab files with BEG,END
      columns, now extended to work also with REF,ALT.

    - Make `annotate -TAG/+TAG` work also with FORMAT fields. (#1259)

    - ID and FILTER can be transferred to INFO and ID can be populated from
      INFO.  However, the FILTER column still cannot be populated from an INFO
      tag because all possible FILTER values must be known at the time of
      writing the header (#947; #1187)

* bcftools consensus:

    - Fix in handling symbolic deletions and overlapping variants.
      (#1149; #1155; #1295)

    - Fix `--iupac-codes` crash on REF-only positions with `ALT="."`. (#1273)

    - Fix `--chain` crash. (#1245)

    - Preserve the case of the genome reference. (#1150)

    - Add new `-a, --absent` option which allows one to set positions with no
      supporting evidence to "N" (or any other character). (#848; #940)

* bcftools convert:

    - The option `--vcf-ids` now works also with `--haplegendsample2vcf`. (#1217)

    - New option `--keep-duplicates`

* bcftools csq:

    - Add `misc/gff2gff.py` script for conversion between various flavors of
      GFF files. The initial commit supports only one type and was contributed
      by @flashton2003. (#530)

    - Add missing consequence types. (PR #1203; #1292)

    - Allow overlapping CDS to support ribosomal slippage. (#1208)

* bcftools +fill-tags:

    - Added new annotations: INFO/END, TYPE, F_MISSING.

* bcftools filter:

    - Make `--SnpGap` optionally filter also SNPs close to other variant types.
      (#1126)

* bcftools gtcheck:

    - Complete revamp of the command. The new version is faster and allows
      N:M sample comparisons, not just 1:N or NxN comparisons.
      Some functionality was lost (plotting and clustering) but may be added
      back on popular demand.

* bcftools +mendelian:

    - Revamp of user options, output VCFs with mendelian errors annotation,
      read PED files (thanks to Giulio Genovese).

* bcftools merge:

    - Update headers when appropriate with the '--info-rules *:join' INFO rule.
      (#1282)

    - Local alleles merging that produce LAA and LPL when requested, a draft
      implementation of https://github.com/samtools/hts-specs/pull/434 (#1138)

    - New `--no-index` which allows one to merge unindexed files. Requires the input
      files to have chromosomes in the same order and consistent with the order
      of sequences in the header. (PR #1253; samtools/htslib#1089)

    - Fixes in gVCF merging. (#1127; #1164)

* bcftools norm:

    - Fixes in `--check-ref s` reference setting features with non-ACGT bases.
      (#473; #1300)

    - New `--keep-sum` switch to keep vector sum constant when splitting
      multiallelics. (#360)
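
One way to keep a vector sum constant when splitting a multiallelic site is to fold the counts of the removed alleles into the first value. The Python sketch below demonstrates that idea on an AD-like vector; it is an assumption about the general approach, not a description of the `norm` source code, and the function name is hypothetical.

```python
# Hypothetical illustration of a sum-preserving split; not the norm implementation.

def split_counts(counts, keep_sum=False):
    """Split REF,ALT1,ALT2,... counts into one (REF, ALT) pair per ALT allele.

    With keep_sum=True the first value absorbs the counts of the other
    ALT alleles so that each pair sums to the original total.
    """
    ref, alts = counts[0], counts[1:]
    total = sum(counts)
    return [((total - alt) if keep_sum else ref, alt) for alt in alts]

print(split_counts([10, 5, 3]))                 # [(10, 5), (10, 3)]
print(split_counts([10, 5, 3], keep_sum=True))  # [(13, 5), (15, 3)]
```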

* bcftools +prune:

    - Extend to allow annotating with various LD metrics: r^2,
      Lewontin's D' (PMID:19433632), or Ragsdale's D (PMID:31697386).
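
For orientation, the textbook definitions of r^2 and Lewontin's D' can be computed from haplotype frequencies as sketched below. The plugin may estimate these quantities differently (for example from unphased genotypes), Ragsdale's D is not covered, and the function and parameter names are assumptions for the example.

```python
# Textbook two-locus LD statistics from haplotype frequencies;
# not the +prune implementation.

def ld_metrics(p_ab, p_a, p_b):
    """Return (D, |D'|, r^2).

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a, p_b: frequencies of alleles A and B
    """
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else float("nan")
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else float("nan")
    return d, d_prime, r2

print(ld_metrics(0.5, 0.5, 0.5))   # complete LD
print(ld_metrics(0.25, 0.5, 0.5))  # independent loci
```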

* bcftools query:

    - New `%N_PASS()` formatting expression to output the number of samples
      that pass the filtering expression.

* bcftools reheader:

    - Improved error reporting to prevent user mistakes. (#1288)

* bcftools roh:

    - Several fixes and improvements
        - the `--AF-file` description incorrectly suggested "REF\tALT" instead
          of the correct "REF,ALT". (#1142)
        - RG lines could have negative length. (#1144)
        - new `--include-noalt` option to allow also ALT=. records. (#1137)

* bcftools scatter:

    - New plugin intended as a convenient inverse to `concat`
      (thanks to Giulio Genovese, PR #1249)

* bcftools +split:

    - New `--groups-file` option for more flexibility of defining desired
      output. (#1240)

    - New `--hts-opts` option to reduce required memory by reusing one
      output header and allow overriding the default hFile's block size
      with `--hts-opts block_size=XXX`. On some file systems (lustre) the
      default size can be 4M which becomes a problem when splitting files
      with 10+ samples.

    - Add support for multisample output and sample renaming

* bcftools +split-vep:

    - Add default types (Integer, Float, String) for VEP subfields and make
      `--columns -` extract all subfields into INFO tags in one go.


## Release 1.10.2 (19th December 2019)

This is a release fix that corrects minor inconsistencies discovered in
previous deliverables.


## Release 1.10 (6th December 2019)


* Numerous bug fixes, usability improvements and sanity checks were added
  to prevent common user errors.

* The -r, --regions (and -R, --regions-file) option should never create
  unsorted VCFs or duplicate records again. This also fixes rare cases where
  a spanning deletion makes a subsequent record invisible to `bcftools isec`
  and other commands.

* Additions to filtering and formatting expressions

    - support for the spanning deletion alternate allele (ALT=*)

    - new ILEN filtering expression to be able to filter by indel length

    - new MEAN, MEDIAN, MODE, STDEV, phred filtering functions

    - new formatting expression %PBINOM (phred-scaled binomial probability),
      %INFO (the whole INFO column), %FORMAT (the whole FORMAT column),
      %END (end position of the REF allele), %END0 (0-based end position
      of the REF allele), %MASK (with multiple files indicates the presence
      of the site in other files)
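
Two of the additions above reduce to simple arithmetic, sketched here in Python. The sign convention for the indel length (negative for deletions, positive for insertions) and the standard phred transformation -10*log10(p) are assumptions of this sketch; see the man page for the authoritative definitions.

```python
# Arithmetic sketch only; conventions are assumed, see the man page.
import math

def ilen(ref, alt):
    """Indel length: negative for deletions, positive for insertions."""
    return len(alt) - len(ref)

def phred(p):
    """Phred-scaled value of a probability p in (0, 1]."""
    return -10 * math.log10(p)

print(ilen("ACGT", "A"))   # -3, a 3-bp deletion
print(ilen("A", "ACC"))    # 2, a 2-bp insertion
print(round(phred(0.001))) # 30
```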

* New plugins

    - `+gvcfz`: compress gVCF file by resizing gVCF blocks according to
      specified criteria

    - `+indel-stats`: collect various indel-specific statistics

    - `+parental-origin`: determine parental origin of a CNV region

    - `+remove-overlaps`: remove overlapping variants.

    - `+split-vep`: query structured annotations such as INFO/CSQ created by
      bcftools/csq or VEP

    - `+trio-dnm`: screen variants for possible de-novo mutations in trios

* `annotate`

    - new -l, --merge-logic option for combining multiple overlapping regions

* `call`

    - new `bcftools call -G, --group-samples` option which allows grouping
      samples into populations and applying the HWE assumption within but
      not across the groups.

* `csq`

    - significant reduction of memory usage in the local -l mode for VCFs
      with thousands of samples and 20% reduction in the non-local
      haplotype-aware mode.

    - fixes a small memory leak and formatting issue in FORMAT/BCSQ at
      sites with many consequences

    - do not print protein sequence of start_lost events

    - support for "start_retained" consequence

    - support for symbolic insertions (ALT="<INS...>"), "feature_elongation"
      consequence

    - new -b, --brief-predictions option to output abbreviated protein
      predictions.

* `concat`

    - the `--naive` option now checks header compatibility when concatenating
      multiple files.

* `consensus`

    - add a new `-H, --haplotype 1pIu/2pIu` feature to output first/second
      allele for phased genotypes and the IUPAC code for unphased genotypes

    - new -p, --prefix option to add a prefix to sequence names on output
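
The `1pIu/2pIu` logic described above can be sketched for single-base alleles in Python. The function name, its arguments and the restriction to SNVs with non-missing genotypes are assumptions of the example, which is not taken from the `consensus` source code.

```python
# SNV-only illustration of the 1pIu/2pIu idea; not the consensus implementation.

# IUPAC ambiguity codes for pairs of distinct bases
IUPAC = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}

def consensus_base(gt, alleles, which=1):
    """Pick the output base for one diploid genotype.

    gt: genotype string such as "0|1" (phased) or "0/1" (unphased)
    alleles: [REF, ALT1, ...], single bases only
    which: 1 for the first allele (1pIu), 2 for the second (2pIu)
    """
    if "|" in gt:                                  # phased: take the chosen allele
        return alleles[int(gt.split("|")[which - 1])]
    bases = {alleles[int(i)] for i in gt.split("/")}
    if len(bases) == 1:                            # unphased homozygote
        return next(iter(bases))
    return IUPAC[frozenset(bases)]                 # unphased het: IUPAC code

print(consensus_base("0|1", ["A", "G"], which=1))  # A
print(consensus_base("0|1", ["A", "G"], which=2))  # G
print(consensus_base("0/1", ["A", "G"]))           # R
```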

* `+contrast`

    - added support for Fisher's test probability and other annotations

* `+fill-from-fasta`

    - new -N, --replace-non-ACGTN option

* `+dosage`

    - fix some serious bugs in dosage calculation

* `+fill-tags`

    - extended to perform simple on-the-fly calculations such as calculating
      INFO/DP from FORMAT/DP.

* `merge`

    - add support for merging FORMAT strings

    - bug fixed in gVCF merging

* `mpileup`

    - a new optional SCR annotation for the number of soft-clipped reads

* `reheader`

    - new -f, --fai option for updating contig lines in the VCF header

* `+trio-stats`

    - extend output to include DNM homs and recurrent DNMs

* VariantKey support



## Release 1.9 (18th July 2018)

* `annotate`

    - REF and ALT columns can be now transferred from the annotation file.

    - fixed bug when setting vector_end values.

* `consensus`

    - new -M option to control output at missing genotypes

    - variants immediately following insertions should not be skipped.  Note
      however, that the current fix requires normalized VCF and may still
      falsely skip variants adjacent to multiallelic indels.

    - bug fixed in -H selection handling

* `convert`

    - the --tsv2vcf option now makes the missing genotypes diploid, "./."
      instead of "."

    - the behavior of -i/-e with --gvcf2vcf changed. Previously only sites with
      FILTER set to "PASS" or "." were expanded and the -i/-e options dropped
      sites completely. The new behavior is to let the -i/-e options control
      which records will be expanded. In order to drop records completely,
      one can stream through "bcftools view" first.

* `csq`

    - since the real consequences of start/splice events are not known,
      the amino acid positions at subsequent variants should stay unchanged

    - add `--force` option to skip malformatted transcripts in GFFs with
      out-of-phase CDS exons.

* `+dosage`: output all alleles and all their dosages at multiallelic sites

* `+fixref`: fix serious bug in -m top conversion

* `-i/-e` filtering expressions:

    - add two-tailed binomial test

    - add functions N_PASS() and F_PASS()

    - add support for lists of samples in filtering expressions, with many
      samples it was impractical to list them all on the command line. Samples
      can be now in a file as, e.g., GT[@samples.txt]="het"

    - allow multiple perl functions in the expressions and some bug fixes

    - fix a parsing problem, '@' was not removed from '@filename' expressions

* `mpileup`: fixed bug where, if samples were renamed using the `-G`
  (`--read-groups`) option, some samples could be omitted from the output file.

* `norm`: update INFO/END when normalizing indels

* `+split`: new -S option to subset samples and to use custom file names
  instead of the defaults

* `+smpl-stats`: new plugin

* `+trio-stats`: new plugin

* Fixed build problems with non-functional configure script produced on
  some platforms


## Release 1.8 (April 2018)

* `-i, -e` filtering: Support for custom perl scripts

* `+contrast`: New plugin to annotate genotype differences between groups
  of samples

* `+fixploidy`: New options for simpler ploidy usage

* `+setGT`: Target genotypes can be set to phased by giving `--new-gt p`

* `run-roh.pl`: Allow passing options directly to `bcftools roh`

* A number of bug fixes


## Release 1.7 (February 2018)

* `-i, -e` filtering: Major revamp, improved filtering by FORMAT fields
  and missing values. New GT=ref,alt,mis etc keywords, check the documentation
  for details.

* `query`: Only matching samples are printed when both the -f and -i/-e
  expressions contain genotype fields. Note that this changes the original
  behavior. Previously all samples were output when one matching sample was
  found. This functionality can be achieved by pre-filtering with view and then
  streaming to query. Compare
        bcftools query -f'[%CHROM:%POS %SAMPLE %GT\n]' -i'GT="alt"' file.bcf
  and
        bcftools view -i'GT="alt"' file.bcf -Ou | bcftools query -f'[%CHROM:%POS %SAMPLE %GT\n]'

* `annotate`: New -k, --keep-sites option

* `consensus`: Fix --iupac-codes output

* `csq`: Homs always considered phased and other fixes

* `norm`: Make `-c none` work and remove `query -c`

* `roh`: Fix errors in the RG output

* `stats`: Allow IUPAC ambiguity codes in the reference file; report the number of missing genotypes

* `+fill-tags`: Add ExcHet annotation

* `+setGT`: Fix bug in binom.test calculation, previously it worked only for nAlt<nRef!

* `+split`: New plugin to split a multi-sample file into single-sample files in one go

* Improve python3 compatibility in plotting scripts



## Release 1.6 (September 2017)

* New `sort` command.

* New options added to the `consensus` command. Note that the `-i, --iupac`
  option has been renamed to `-I, --iupac`, in favor of the standard
  `-i, --include`.

* Filtering expressions (`-i/-e`): support for `GT=<type>` expressions and
  for lists and ranges (#639) - see the man page for details.

* `csq`: relax some GFF3 parsing restrictions to enable using Ensembl
  GFF3 files for plants (#667)

* `stats`: add further documentation to output stats files (#316) and
  include haploid counts in per-sample output (#671).

* `plot-vcfstats`: further fixes for Python3 (@nsoranzo, #645, #666).

* `query` bugfix (#632)

* `+setGT` plugin: new option to set genotypes based on a two-tailed binomial
  distribution test. Also, allow combining `-i/-e` with `-t q`.

* `mpileup`: fix typo (#636)

* `convert --gvcf2vcf` bugfix (#641)

* `+mendelian`: recognize some mendelian inconsistencies that were
  being missed (@oronnavon, #660), also add support for multiallelic
  sites and sex chromosomes.


## Release 1.5 (June 2017)

* Added autoconf support to bcftools. See `INSTALL` for more details.

* `norm`: Make norm case insensitive (#601). Trim the reference allele (#602).

* `mpileup`: fix for misreported indel depths for reads containing adjacent
  indels (3c1205c1).

* `plot-vcfstats`: Open stats file in text mode, not binary (#618).

* `fixref` plugin: Allow multiallelic sites in the `-i, --use-id reference`.
  Also flip genotypes, not just REF/ALT!

* `merge`: fix gVCF merge bug when last record on a chromosome opened a
  gVCF block (#616)

* New options added to the ROH plotting script.

* `consensus`: Properly flush chain info (#606, thanks to @krooijers).

* New `+prune` plugin for pruning sites by LD (R2) or maximum number of
  records within a window.

* New N_MISSING, F_MISSING (number and fraction missing) filtering
  expressions.

* Fix HMM initialization in `roh` when snapshots are used in multiple
  chromosome VCF.

* Fix buffer overflow (#607) in `filter`.


## Release 1.4.1 (8 May 2017)

* `roh`: Fixed malfunctioning options `-m, --genetic-map` and `-M, --rec-rate`,
  and newly allowed their combination. Added a convenience wrapper `misc/run-roh.pl`
  and an interactive script for visualizing the calls `misc/plot-roh.py`.

* `csq`: More control over warning messages (#585).

* Portability improvements (#587). Still work to be done on this front.

* Add support for breakends to `view`, `norm`, `query` and filtering (#592).

* `plot-vcfstats`: Fix for python 2/3 compatibility (#593).

* New `-l, --list` option for `+af-dist` plugin.

* New `-i, --use-id` option for `+fix-ref` plugin.

* Add `--include/--exclude` options to `+guess-ploidy` plugin.

* New `+check-sparsity` plugin.

* Miscellaneous bugfixes for #575, #584, #588, #599, #535.


## Release 1.4 (13 March 2017)

Two new commands - `mpileup` and `csq`:

* The `mpileup` command has been imported from samtools to bcftools. The
  reasoning behind this is that bcftools calling is intimately tied to mpileup
  and any change to one often requires changes to the other. Only the
  genotype likelihood (BCF output) part of mpileup has moved to bcftools,
  while the textual pileup output remains in samtools. The BCF output option
  in `samtools mpileup` will likely be removed in a release or two or when
  changes to `bcftools call` are incompatible with the old mpileup output.

  The basic mpileup functionality remains unchanged as do most of the command
  line options, but there are some differences and new features that one
  should be aware of:

  - The option `samtools mpileup -t, --output-tags` changed to `bcftools
    mpileup -a, --annotate` to avoid conflict with the `-t, --targets`
    option common across other bcftools commands.

  - `-O, --output-BP` and `-s, --output-MQ` are no longer used as they are
    only for textual pileup output, which is not included in `bcftools
    mpileup`. The `-O` short option was reassigned to `--output-type` and `-s`
    to `--samples` for consistency with other bcftools commands.

  - `-g, --BCF`, `-v, --VCF`, and `-u, --uncompressed` options from
    `samtools mpileup` are no longer used, being replaced by the
    `-O, --output-type` option common to other bcftools commands.

  - The `-f, --fasta-ref` option is now required by default to help avoid user
    errors. Can be disabled using `--no-reference`.

  - The option `-d, --depth` (max per-file depth) now behaves as expected
    and according to the documentation, and prints meaningful diagnostics.

  - The `-S, --samples-file` can be used to rename samples on the fly. See man
    page for details.

  - The `-G, --read-groups` functionality has been extended to allow
    reassignment, grouping and exclusion of readgroups. See man page for
    details.

  - The `-l, --positions` option has been replaced by the `-t, --targets` and
    `-T, --targets-file` options to be consistent with other bcftools
    commands.

  - gVCF output is supported. Per-sample gVCFs created by mpileup can be
    merged using `bcftools merge --gvcf`.

  - Can generate mpileup output on multiple (indexed) regions using the
    `-r, --regions` and `-R, --regions-file` options. In samtools, one
    was restricted to a single region with the `-r, --region` option.

  - Several speedups thanks to @jkbonfield (cf3a55a).

* `csq`: New command for haplotype-aware variant consequence calling.
  See man page and [paper](https://www.ncbi.nlm.nih.gov/pubmed/28205675).
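
The option changes listed for `mpileup` above can be summarised as a small
translation table. The following Python sketch is illustrative only and is not
part of bcftools: the helper name `translate_mpileup_args` is hypothetical, the
mapping of `-l` to `-T` assumes the old option was given a positions file, and
the mapping of `-g`/`-v`/`-u` to `-O b|u|z|v` is our reading of the two tools'
output options rather than something stated in these notes.

```python
# Sketch only: rewrites a samtools-style mpileup argument list into the
# bcftools mpileup form described in the notes above. The helper name and
# the -g/-v/-u to -O mapping are assumptions, not bcftools code.

RENAMED = {
    "-t": "-a", "--output-tags": "--annotate",    # avoids clash with --targets
    "-l": "-T", "--positions": "--targets-file",  # assumes a positions file
}
# Flags that only made sense for textual pileup output; they take no value.
TEXT_PILEUP_ONLY = {"-O", "--output-BP", "-s", "--output-MQ"}


def translate_mpileup_args(old_args):
    """Return bcftools-style arguments for a list of samtools-style ones."""
    format_flag_seen = vcf = uncompressed = False
    new_args = []
    for arg in old_args:
        if arg in ("-g", "--BCF"):
            format_flag_seen = True
        elif arg in ("-v", "--VCF"):
            format_flag_seen = vcf = True
        elif arg in ("-u", "--uncompressed"):
            format_flag_seen = uncompressed = True
        elif arg in TEXT_PILEUP_ONLY:
            continue  # dropped: no textual pileup in bcftools mpileup
        else:
            new_args.append(RENAMED.get(arg, arg))
    if format_flag_seen:
        # -O b: compressed BCF, u: uncompressed BCF,
        # -O z: compressed VCF, v: uncompressed VCF
        if vcf:
            output_type = "v" if uncompressed else "z"
        else:
            output_type = "u" if uncompressed else "b"
        new_args = ["-O", output_type] + new_args
    return new_args


if __name__ == "__main__":
    old = ["-u", "-g", "-t", "DP,AD", "-f", "ref.fa", "in.bam"]
    print(" ".join(["bcftools", "mpileup"] + translate_mpileup_args(old)))
```

The sketch only handles options passed as separate tokens. Combined short
options such as `-ug` would need to be split first, and the result should be
checked against the man page before use.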


Updates, improvements and bugfixes for many other commands:

* `annotate`: `--collapse` option added. `--mark-sites` now works with
  VCF files rather than just tab-delimited files. Now possible to annotate
  a subset of samples from tab file, not just VCF file (#469). Bugfixes (#428).

* `call`: New option `-F, --prior-freqs` to take advantage of prior knowledge
  of population allele frequencies. Improved calculation of the QUAL score
  particularly for REF sites (#449, 7c56870). `PLs>=256` allowed in
  `call -m`. Bugfixes (#436).

* `concat --naive` now works with vcf.gz in addition to bcf files.

* `consensus`: handle variants overlapping region boundaries (#400).

* `convert`: gvcf2vcf support for mpileup and GATK. New `--sex` option to
  assign sex to be used in certain output types (#500). Large speedup of
  `--hapsample` and `--haplegendsample` (e8e369b) especially with `--threads`
  option enabled. Bugfixes (#460).

* `cnv`: improvements to output (be8b378).

* `filter`: bugfixes (#406).

* `gtcheck`: improved cross-check mode (#441).

* `index` can now specify the path to the output index file, and gains the
  `--threads` option.

* `merge`: Large overhaul of `merge` command including support for merging
  gVCF files created by `bcftools mpileup --gvcf` with the new `-g, --gvcf`
  option. New options `-F` to control filter logic and `-0` to set missing
  data to REF. Resolved a number of longstanding issues (#296, #361, #401,
  #408, #412).

* `norm`: Bugfixes (#385,#452,#439), more informative error messages (#364).

* `query`: `%END` plus `%POS0`, `%END0` (0-indexed) support - allows easy BED
  format output (#479). `%TBCSQ` for use with the new `csq` command. Bugfixes
  (#488,#489).

* `plugin`: A number of new plugins:

  - `GTsubset` (thanks to @dlaehnemann)
  - `ad-bias`
  - `af-dist`
  - `fill-from-fasta`
  - `fixref`
  - `guess-ploidy` (deprecates `vcf2sex` plugin)
  - `isecGT`
  - `trio-switch-rate`

  and changes to existing plugins:

  - `tag2tag`: Added `gp-to-gt`, `pl-to-gl` and `--threshold` options and
    bugfixes (#475).
  - `ad-bias`: New `-d` option for minimum depth.
  - `impute-info`: Bugfix (49a9eaf).
  - `fill-tags`: Added ability to aggregate tags for sample subgroups, thanks
    to @mh11 (#503). HWE tag added as an option.
  - `mendelian`: Bugfix (#566).

* `reheader`: allow multispace delimiters in `--samples` option.

* `roh`: Now possible to process multiple samples at once. This allows
  considerable speedups for files with thousands of samples where the cost of
  HMM is negligible compared to I/O and decompressing. In order to fit tens of
  thousands of samples in memory, a sliding HMM can be used (new `--buffer-size`
  option). Viterbi training now uses the Baum-Welch algorithm, and works much
  better. Support for gVCFs or FORMAT/PL tags. Added `-o, --output` and
  `-O, --output-type` options to control output of sites or regions
  (compression optional). Many bugs fixed: no longer segfaults on missing PL
  values, and a typo in the genetic map calculation that resulted in a slowdown
  and incorrect results has been corrected.

* `stats`: Bugfixes (16414e6), new options `--af-bins` and `--af-tags` to control
  allele frequency binning of output. Per-sample genotype concordance tables
  added (#477).

* `view -a, --trim-alt-alleles`: Various bugfixes for missing data. More
  informative errors are now given on failure to help pinpoint problems.
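
The `query` entry above mentions that `%POS0` and `%END0` make BED output easy.
The arithmetic behind this can be sketched as follows: VCF positions are
1-based and inclusive, while BED intervals are 0-based with an exclusive end.
The function name is hypothetical, and the sketch assumes the end position is
derived from the length of the REF allele; it illustrates the coordinate
conversion only and is not how bcftools computes these fields.

```python
# Sketch only: relationship between 1-based VCF coordinates and the
# 0-based values needed for BED. Not bcftools code; assumes the record's
# end is POS + len(REF) - 1.

def vcf_to_bed_fields(chrom, pos, ref):
    """Return the coordinate variants for one VCF record.

    pos is the 1-based VCF POS; ref is the REF allele string.
    """
    end = pos + len(ref) - 1   # 1-based inclusive end (as in %END)
    pos0 = pos - 1             # 0-based start (as in %POS0)
    end0 = end - 1             # 0-based inclusive end (as in %END0)
    # BED uses a 0-based start and an exclusive end, so the pair is
    # (pos0, end): the interval length equals len(ref).
    return {"POS0": pos0, "END": end, "END0": end0,
            "bed": (chrom, pos0, end)}


if __name__ == "__main__":
    fields = vcf_to_bed_fields("chr1", 100, "ACG")
    print("\t".join(str(x) for x in fields["bed"]))
```

Under these assumptions, a format string along the lines of
`bcftools query -f '%CHROM\t%POS0\t%END\n'` would emit one BED interval per
record.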


General changes:

* Timestamps are now added to header lines summarising the command (#467).

* Use of the `--threads` option should be faster across the board thanks to
  changes in HTSlib: threads are now shared by the compression and
  decompression calls.

* Changes to genotype filtering with `-i, --include` and `-e, --exclude` (#454).


## Noteworthy changes in release 1.3.1 (22 April 2016)

* The `concat` command has a new `--naive` option for faster operations on
  large BCFs (PR #359).
* `GTisec`: new plugin courtesy of David Laehnemann (@dlaehnemann) to count
  genotype intersections across all possible sample subsets in a VCF file.
* Numerous VCF parsing fixes.
* Build fix: _peakfit.c_ now builds correctly with GSL v2 (#378).
* Various bug fixes and improvements to the `annotate` (#365), `call` (#366),
  `index` (#367), `norm` (#368, #385), `reheader` (#356), and `roh` (#328)
  commands, and to the `fill-tags` (#345) and `tag2tag` (#394) plugins.
* Clarified documentation of `view` filter options, and of the
  `--regions-file` and `--targets-file` options (#357, #411).


## Noteworthy changes in release 1.3 (15 December 2015)

* `bcftools call` has new options `--ploidy` and `--ploidy-file` to make
  handling sample ploidy easier. See man page for details.
* `stats`: `-i`/`-e` short options changed to `-I`/`-E` to be consistent with
  the filtering `-i`/`-e` (`--include`/`--exclude`) options used in other
  tools.
* general `--threads` option to control the number of output compression
  threads used when outputting compressed VCF or BCF.
* `cnv` and `polysomy`: new commands for detecting CNVs, aneuploidy, and
  contamination from SNP genotyping data.
* various new options, plugins, and bug fixes, including #84, #201, #204,
  #205, #208, #211, #222, #225, #242, #243, #249, #282, #285, #289, #302,
  #311, #318, #336, and #338.


## Noteworthy changes in release 1.2 (2 February 2015)

* new `bcftools consensus` command
* new `bcftools annotate` plugins: fixploidy, vcf2sex, tag2tag
* more features in `bcftools convert` command, amongst others new
  `--hapsample` function (thanks to Warren Kretzschmar @wkretzsch)
* support for complements in `bcftools annotate --remove`
* support for `-i`/`-e` filtering expressions in `bcftools isec`
* improved error reporting
* `bcftools call`
  - the default prior increased from `-P 1e-3` to `-P 1.1e-3`, some clear
    calls were missed with default settings previously
  - support for the new symbolic allele `<*>`
  - support for `-f GQ`
  - bug fixes, such as: proper trimming of DPR tag with `-c`; the `-A` switch
    does not add back records removed by `-v` and the behaviour has been made
    consistent with `-c` and `-m`
* many bug fixes and improvements, such as
  - bug in filtering, FMT & INFO vs INFO & FMT
  - fixes in `bcftools merge`
  - filter update AN/AC with `-S`
  - isec outputs matching records for both VCFs in the Venn mode
  - annotate considers alleles when working with `Number=A,R` tags
  - new `--set-id` feature for annotate
  - `convert` can be used similarly to `view`