author | Samantha McVey <samantham@posteo.net> | 2016-11-09 20:57:13 -0800 |
---|---|---|
committer | Samantha McVey <samantham@posteo.net> | 2016-11-09 20:57:13 -0800 |
commit | 70d6977ccc07b667da0ed165b7706afbb5190816 (patch) | |
tree | de04ea65d3bd347bb495feb5c43aadfdd95890de /tests | |
parent | 58ffb4057fcd67e5c0ba33f76344cc8ad927c72b (diff) |
Use charlock_holmes for encoding detection. In my tests it correctly identified the incorrect encodings that were present in older commits. This should help keep such files from being committed again by giving contributors instant feedback and allowing every pull request to be checked.
Diffstat (limited to 'tests')
-rw-r--r-- | tests/encoding.rb | 16 |
1 files changed, 10 insertions, 6 deletions
diff --git a/tests/encoding.rb b/tests/encoding.rb
index c4d41d19..ae7e495f 100644
--- a/tests/encoding.rb
+++ b/tests/encoding.rb
@@ -1,14 +1,18 @@
 #!/usr/bin/env ruby
+require 'charlock_holmes'
 $file_count = 0;
 markdown_files = Dir["./**/*.html.markdown"]
 markdown_files.each do |file|
   begin
-    file_bin = File.open(file, "rb")
-    contents = file_bin.read
-    if ! contents.valid_encoding?
-      puts "#{file} has an invalid encoding! Please save the file in UTF-8!"
-    else
+    contents = File.read(file)
+    detection = CharlockHolmes::EncodingDetector.detect(contents)
+    case detection[:encoding]
+    when 'UTF-8'
+      $file_count = $file_count + 1
+    when 'ISO-8859-1'
       $file_count = $file_count + 1
+    else
+      puts "#{file} was detected as #{detection[:encoding]} encoding! Please save the file in UTF-8!"
     end
   rescue Exception => msg
     puts msg
@@ -20,6 +24,6 @@ if files_failed != 0
   puts "Please resave the file as UTF-8."
   exit 1
 else
-  puts "Success. All #{$file_count} files passed UTF-8 validity checks"
+  puts "Success. All #{$file_count} files Ruby's UTF-8 validity checks. This won't catch most problems."
   exit 0
 end
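As a rough illustration of why the removed check rarely fired: a file read in "rb" mode comes back as a binary (ASCII-8BIT) string, and Ruby treats any byte sequence as valid in that encoding. The sketch below uses only the standard library. The helper `valid_utf8?` and the sample bytes are hypothetical and not part of this commit.

```ruby
# Hypothetical helper (not in the commit): tag the bytes as UTF-8,
# then ask Ruby whether they form a valid UTF-8 sequence.
def valid_utf8?(bytes)
  bytes.dup.force_encoding(Encoding::UTF_8).valid_encoding?
end

# Sample bytes for "café" saved as ISO-8859-1: the é is the lone byte 0xE9,
# which is not a complete UTF-8 sequence.
latin1_bytes = "caf\xE9".b

# The old check called valid_encoding? on a binary string, which is
# true for any bytes, so the "invalid encoding" branch could not trigger.
puts "binary string valid?  #{latin1_bytes.valid_encoding?}"

# Checking the same bytes as UTF-8 exposes the problem.
puts "same bytes as UTF-8?  #{valid_utf8?(latin1_bytes)}"
```

Even a strict UTF-8 validity check only says whether the bytes decode. It does not say which encoding the file was written in, which is the gap a detector such as charlock_holmes is meant to fill.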