Here's the code and results ...
# Print one row per letter, with one asterisk per whole percentage point.
def histogram(frequency)
  frequency.each do |c, v|
    puts "#{c}: #{'*' * (v * 100).to_i}"
  end
end

File.open(ARGV[0], 'r') do |f|
  # Start every letter at zero so letters that never occur still get a row.
  frequency = {}
  ('A'..'Z').each { |c| frequency[c] = 0 }
  total_characters = 0
  # Count letters only, ignoring case; digits, punctuation and whitespace are skipped.
  f.each_char do |c|
    if c.upcase =~ /[A-Z]/
      total_characters += 1
      frequency[c.upcase] += 1
    end
  end
  # Convert raw counts into fractions of all letters seen.
  percentage_frequency = frequency.map { |c, v| [c, v.to_f / total_characters.to_f] }
  histogram(percentage_frequency)
end
A: ********
B: *
C: **
D: ****
E: ************
F: **
G: **
H: ******
I: ******
J:
K:
L: ****
M: **
N: ******
O: *******
P: *
Q:
R: *****
S: ******
T: *********
U: **
V:
W: **
X:
Y: *
Z:
A: ******
B: *
C: ***
D: *****
E: *************
F: **
G: *
H: **
I: ******
J:
K:
L: *****
M: **
N: ******
O: *******
P: **
Q:
R: *******
S: *******
T: ********
U: ***
V:
W: *
X:
Y:
Z:
As you can see, the two histograms are pretty similar. This particular Rails project used HAML, so I'm not sure whether the results would differ with ERB. Also, as I noted, this was a fairly small project.
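If you'd rather put a number on "pretty similar" than eyeball the asterisks, here is a minimal sketch. It is not part of the original script: the helper names letter_proportions and distribution_distance are made up for illustration, and the sum of absolute differences is just one simple choice of measure among many.

```ruby
# Hypothetical helper (not in the original script): the share of each letter
# A-Z in a string, returned as a Hash of letter => Float between 0 and 1.
def letter_proportions(text)
  counts = Hash.new(0)
  text.upcase.each_char { |c| counts[c] += 1 if c =~ /[A-Z]/ }
  total = counts.values.sum
  # Guard against input with no letters at all, which would divide by zero.
  ('A'..'Z').to_h { |c| [c, total.zero? ? 0.0 : counts[c].to_f / total] }
end

# Sum of absolute differences between two letter distributions.
# 0.0 means identical; 2.0 means the two texts share no letters.
def distribution_distance(a, b)
  ('A'..'Z').sum { |c| (a[c] - b[c]).abs }
end

# Optional command-line use: pass two file names to compare them.
if __FILE__ == $PROGRAM_NAME && ARGV.size == 2
  first, second = ARGV.map { |path| letter_proportions(File.read(path)) }
  puts format('distance: %.3f', distribution_distance(first, second))
end
```

Saved under a name of your choosing, it would take the two input files as arguments, for example moby10b.txt and rails_files.txt. The closer the printed distance is to zero, the more alike the two histograms are.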
The code runs against a single file that's passed in on the command line. So run it like this ...
ruby letter_freqency.rb moby10b.txt > moby_results.txt
for example. For a Rails project, I cat'd all the files together and then ran the code like this ...
cat `find . -iname \*.rb -or -iname \*.haml` > rails_files.txt
ruby letter_freqency.rb rails_files.txt > rails_results.txt
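If you don't have cat and find handy (on Windows, say), the same gathering step can be sketched in Ruby. This is an illustration rather than what I actually ran: read_project_files is a made-up helper name, and the default extension list simply mirrors the find command above.

```ruby
# Hypothetical helper (not in the original script): concatenate the contents
# of every file under `root` whose extension, ignoring case, is in `extensions`.
def read_project_files(root, extensions = %w[.rb .haml])
  Dir.glob(File.join(root, '**', '*'))
     .select { |path| File.file?(path) && extensions.include?(File.extname(path).downcase) }
     .sort                             # fixed order, so repeated runs give identical output
     .map { |path| File.read(path) }
     .join
end

# Optional command-line use: pass the project directory and redirect the output.
print read_project_files(ARGV[0]) if __FILE__ == $PROGRAM_NAME && ARGV.size == 1
```

Saved as, say, collect_files.rb (a name picked just for this example), you'd redirect its output to rails_files.txt and feed that file to the histogram script as before. One difference from the find version: the '**' glob skips hidden directories by default, so anything under a dot-directory is left out.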
Let me know if you try this on one of your projects and post the results in the comments.