author: Andrej Shadura <andrew.shadura@collabora.co.uk> 2019-12-13 14:54:38 +0100
committer: Andrej Shadura <andrew.shadura@collabora.co.uk> 2019-12-13 14:54:59 +0100
commit: c0738b9d9944ec9687a75f3a24c4c02544ba301b
tree: 6bd73c2a1c81c193bc339faf93edfc9c7f1f9522
parent: ef19b373cb3d288524d07c4aaee98a264d9fee4a
Prevent some of the binary junk from being dumped into the output as is
When licensecheck processes binary files, it doesn’t try to detect the format, so it often catches random binary data and thinks it’s the copyright information. Detect some non-printable characters (U+0000 to U+001F) in the copyright information and cut it off at the first such character.
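To illustrate the effect of the cut-off described above, here is a minimal standalone sketch. The helper name `cut_at_control_char` and the sample strings are hypothetical and exist only for this illustration; the actual change is a single substitution inside `scan_files`, shown in the diff.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Dpkg::Copyright::Scanner: it applies the
# same substitution as the patch to one candidate copyright string.
sub cut_at_control_char {
    my ($c) = @_;
    # Remove the first character in the range U+0000..U+001F and the text
    # that follows it. Without the /s modifier, "." does not match a newline,
    # so the cut stops at the end of the current line.
    $c =~ s![\x00-\x1f].*!!;
    return $c;
}

# Made-up inputs: trailing binary junk, junk from the first byte, clean text.
my @samples = (
    "2019, Jane Doe\x01\x02junk",
    "\x00ELF",
    "2019, Jane Doe",
);

printf "[%s]\n", cut_at_control_char($_) for @samples;
# prints:
# [2019, Jane Doe]
# []
# [2019, Jane Doe]
```

Note that the range also covers the tab character (U+0009), so a tab inside an otherwise valid statement would truncate it as well, while U+007F and bytes above it are left alone.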
-rw-r--r-- lib/Dpkg/Copyright/Scanner.pm | 1
-rw-r--r-- t/scanner/examples/binary-copyright-empty.in | bin 0 -> 90 bytes
-rw-r--r-- t/scanner/examples/binary-copyright-empty.out | 4
-rw-r--r-- t/scanner/examples/binary-copyright.in | bin 0 -> 12330 bytes
-rw-r--r-- t/scanner/examples/binary-copyright.out | 4
5 files changed, 9 insertions, 0 deletions
diff --git a/lib/Dpkg/Copyright/Scanner.pm b/lib/Dpkg/Copyright/Scanner.pm
index b35af999..453825a3 100644
--- a/lib/Dpkg/Copyright/Scanner.pm
+++ b/lib/Dpkg/Copyright/Scanner.pm
@@ -282,6 +282,7 @@ sub scan_files ( %args ) {
$c =~ s!^[\s,/*]|[\s,#/*-]+$!!g;
$c =~ s/--/-/g;
$c =~ s!\s+\*/\s+! !;
+ $c =~ s![\x00-\x1f].*!!; # cut off everything after and including the first non-printable
$c = __pack_copyright($c);
diff --git a/t/scanner/examples/binary-copyright-empty.in b/t/scanner/examples/binary-copyright-empty.in
new file mode 100644
index 00000000..55fb1115
--- /dev/null
+++ b/t/scanner/examples/binary-copyright-empty.in
Binary files differ
diff --git a/t/scanner/examples/binary-copyright-empty.out b/t/scanner/examples/binary-copyright-empty.out
new file mode 100644
index 00000000..2ddf1439
--- /dev/null
+++ b/t/scanner/examples/binary-copyright-empty.out
@@ -0,0 +1,4 @@
+Files: *
+Copyright: UNKNOWN
+License: UNKNOWN
+
diff --git a/t/scanner/examples/binary-copyright.in b/t/scanner/examples/binary-copyright.in
new file mode 100644
index 00000000..b11bfb88
--- /dev/null
+++ b/t/scanner/examples/binary-copyright.in
Binary files differ
diff --git a/t/scanner/examples/binary-copyright.out b/t/scanner/examples/binary-copyright.out
new file mode 100644
index 00000000..05193e3a
--- /dev/null
+++ b/t/scanner/examples/binary-copyright.out
@@ -0,0 +1,4 @@
+Files: *
+Copyright: ÿyŠÿt}kÿoiEÿiP
+License: UNKNOWN
+