author | Andrej Shadura <andrew.shadura@collabora.co.uk> | 2019-12-13 14:54:38 +0100 |
---|---|---|
committer | Andrej Shadura <andrew.shadura@collabora.co.uk> | 2019-12-13 14:54:59 +0100 |
commit | c0738b9d9944ec9687a75f3a24c4c02544ba301b | |
tree | 6bd73c2a1c81c193bc339faf93edfc9c7f1f9522 | |
parent | ef19b373cb3d288524d07c4aaee98a264d9fee4a | |
Prevent some of the binary junk from being dumped into the output as is
When licensecheck processes binary files, it doesn’t try to detect
the format, so it often picks up random binary data and thinks it’s
the copyright information. Detect some non-printable characters
(U+0000 to U+001F) in the copyright information and cut the string
off at the first such character.
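
The effect of the added substitution can be sketched in isolation. This is a minimal illustration, not code from the scanner: the helper name `cut_at_control_char` and the sample strings are made up for the example, and only the regular expression itself is taken from the patch.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Dpkg::Copyright::Scanner): applies the
# same substitution that the patch adds to scan_files.
sub cut_at_control_char {
    my ($c) = @_;
    # Remove the first character in U+0000..U+001F and everything after it
    # up to the end of the line ("." does not match a newline without /s).
    $c =~ s![\x00-\x1f].*!!;
    return $c;
}

# Made-up sample strings, for illustration only.
my $clean  = cut_at_control_char("2019 Example Author\x00\x07\x1bjunk");
my $intact = cut_at_control_char("2019 Example Author");
my $empty  = cut_at_control_char("\x01\x02 only junk");

# Prints "[2019 Example Author]" twice, then "[]".
printf "[%s]\n", $_ for $clean, $intact, $empty;
```

Two things follow from the character class. Tab (U+0009) is inside the range, so a tab also truncates the string. Characters above U+001F pass through untouched, which is consistent with the expected output in binary-copyright.out still containing `ÿyÿt}kÿoiEÿiP`. When nothing usable remains, as in the third sample, the binary-copyright-empty test expects `Copyright: UNKNOWN`, presumably because the string ends up empty.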
-rw-r--r-- | lib/Dpkg/Copyright/Scanner.pm | 1 |
-rw-r--r-- | t/scanner/examples/binary-copyright-empty.in | bin, 0 -> 90 bytes |
-rw-r--r-- | t/scanner/examples/binary-copyright-empty.out | 4 |
-rw-r--r-- | t/scanner/examples/binary-copyright.in | bin, 0 -> 12330 bytes |
-rw-r--r-- | t/scanner/examples/binary-copyright.out | 4 |
5 files changed, 9 insertions, 0 deletions
diff --git a/lib/Dpkg/Copyright/Scanner.pm b/lib/Dpkg/Copyright/Scanner.pm
index b35af999..453825a3 100644
--- a/lib/Dpkg/Copyright/Scanner.pm
+++ b/lib/Dpkg/Copyright/Scanner.pm
@@ -282,6 +282,7 @@ sub scan_files ( %args ) {
  $c =~ s!^[\s,/*]|[\s,#/*-]+$!!g;
  $c =~ s/--/-/g;
  $c =~ s!\s+\*/\s+! !;
+ $c =~ s![\x00-\x1f].*!!; # cut off everything after and including the first non-printable
  $c = __pack_copyright($c);
diff --git a/t/scanner/examples/binary-copyright-empty.in b/t/scanner/examples/binary-copyright-empty.in
Binary files differ
new file mode 100644
index 00000000..55fb1115
--- /dev/null
+++ b/t/scanner/examples/binary-copyright-empty.in
diff --git a/t/scanner/examples/binary-copyright-empty.out b/t/scanner/examples/binary-copyright-empty.out
new file mode 100644
index 00000000..2ddf1439
--- /dev/null
+++ b/t/scanner/examples/binary-copyright-empty.out
@@ -0,0 +1,4 @@
+Files: *
+Copyright: UNKNOWN
+License: UNKNOWN
+
diff --git a/t/scanner/examples/binary-copyright.in b/t/scanner/examples/binary-copyright.in
Binary files differ
new file mode 100644
index 00000000..b11bfb88
--- /dev/null
+++ b/t/scanner/examples/binary-copyright.in
diff --git a/t/scanner/examples/binary-copyright.out b/t/scanner/examples/binary-copyright.out
new file mode 100644
index 00000000..05193e3a
--- /dev/null
+++ b/t/scanner/examples/binary-copyright.out
@@ -0,0 +1,4 @@
+Files: *
+Copyright: ÿyÿt}kÿoiEÿiP
+License: UNKNOWN
+