Free Data Feed - extra bits
We have talked about how to get the ticker names for the US exchanges.
We have talked about how to get the data for those tickers.
Now here is a quick bit about variations on that last part - getting the data.
First up, if we want to get just a single ticker file, then the following script allows us to pass in the ticker name via the command line
instead of reading in many via a big list. You would call this with "perl -w getSingleHistory.pl KO" if you wanted to get all of the history
for Coke and you were in the directory of the script*.
#getSingleHistory.pl
use strict;
use LWP::Simple;#might eventually need a UserAgent if Yahoo starts requiring it
my $stockDir = 'stocks/';#the directory we are writing the stocks into
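my $ticker = shift(@ARGV);#incoming from the command line.
die "Usage: perl -w getSingleHistory.pl TICKER\n" unless defined $ticker;#bail out if no ticker was given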
my $data = '';
#now we will grab the data
print "$ticker\n";#comment out this line if you don't want to see the progress
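$data = get('http://itable.finance.yahoo.com/table.csv?s=' . $ticker);
#get() returns undef on failure, so warn and fall back to an empty file
unless(defined $data){ warn "No data returned for $ticker\n"; $data = ''; }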
#write it out to disk (clobber)
open(OUTFILE, '+>' , $stockDir . $ticker) or die "Could not write out data file: $!\n";
print OUTFILE $data;
close(OUTFILE) or die "Could not close data file: $!\n";
#and done
Another thing we noted in the previous post was that, looking over the data, you will see that some of the files don't have any data in them.
This could be because Yahoo didn't return the data properly, for any number of reasons, or simply because the ticker doesn't exist
any longer. One's first thought might be to write a script to delete all of the files which don't have any data, which makes sense: no
point keeping them around. The problem is that if you find one that doesn't have data and then manually look it up on the data feed link,
you may very well find that it loads data just fine for you. That means there was some non-fatal error during the download. The data is out
there and we want it, so there is no need to delete the file. Instead we just want a record of which files don't have data in them. If you
are on a Windows system, or really any system that gives you a graphical view of the filesystem, then you could sort the files by size, and the
ones that are considerably smaller than the rest are the ones to take note of. If there are only a few, then you could just
look them up right then. But there might be many of them, so for the most part it is probably better for a script to make a list of them for you to
refer to later. The script below treats any file with fewer than 10 lines as questionable.
#cleanup.pl
use strict;
my $count = 0;
my $file = '';
my @questionableFiles = ();
#get all of the stock filenames
my @filenames = glob('stocks/*');
#iterate over them
for $file (@filenames){
#get their linecount
$count = `wc -l < $file`;
die "wc failed: $?" if $?;
chomp($count);#strip the trailing newline from the wc output
$count = int($count);
if($count < 10){
#if in here, we should look into it
#get all text after "stocks/"
$file = substr($file,7);
#put it into an array
push(@questionableFiles, $file);
}
}
#now dump that out to a file
open(OUTPUT,'+>', 'questionableFiles.txt') or die "Bad things man: $!\n";
for(@questionableFiles){
print OUTPUT "$_\n";
}
close(OUTPUT) or die "Couldn't close the file: $!\n";
#and done
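Since the Windows case came up above, here is a rough sketch of the same check done without shelling out to wc, which is not there by default on Windows. The sub names count_lines and questionable_files are just ones made up for this example, and it assumes the same stocks directory and the same 10 line cutoff as cleanup.pl.

```perl
use strict;
use warnings;

# Count the lines in a file in pure Perl instead of calling wc.
# Returns undef if the file cannot be opened.
# Note: a last line with no trailing newline is counted too, unlike wc -l.
sub count_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or return undef;
    my $lines = 0;
    while (my $line = <$fh>) {
        $lines++;
    }
    close($fh);
    return $lines;
}

# Return the files in $dir that have fewer than $min_lines lines.
# Assumes $dir has no spaces in it, since glob splits on whitespace.
sub questionable_files {
    my ($dir, $min_lines) = @_;
    my @found;
    for my $path (sort glob("$dir/*")) {
        my $lines = count_lines($path);
        next unless defined $lines;    # skip anything we could not read
        push(@found, $path) if $lines < $min_lines;
    }
    return @found;
}

# Same cutoff as cleanup.pl: fewer than 10 lines is worth a second look.
print "$_\n" for questionable_files('stocks', 10);
```

This only prints the list; writing it to questionableFiles.txt would work the same way as in cleanup.pl.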
*This is actually an interesting point. For these scripts, you should be in the directory of the scripts, since that is what the scripts assume. But later, if and when
we want these to be called by cron, we are going to need full paths in the scripts. So keep that in mind if we move to calling them via cron and you see
issues with them not working properly that way even though they work just fine when you call them yourself from the directory.
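One possible way to get full paths without hard-coding them is the core FindBin module, which gives you the directory the script itself lives in. This is only a sketch of the idea and not something the scripts above do yet. The sub name stock_path is made up for this example, and it assumes the stocks directory sits next to the script.

```perl
use strict;
use warnings;
use FindBin qw($Bin);    # $Bin is the directory containing this script
use File::Spec;

# Build the full path to a ticker's data file relative to the script
# itself, rather than to whatever directory cron happens to start in.
sub stock_path {
    my ($ticker) = @_;
    return File::Spec->catfile($Bin, 'stocks', $ticker);
}

print stock_path('KO'), "\n";
```

With something like this, the open call in getSingleHistory.pl could take stock_path($ticker) instead of $stockDir . $ticker.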
**NOTE**
It appears the URL that was working when I tested this no longer works reliably. So I am now saying to use this one, which does appear to work: "http://itable.finance.yahoo.com/table.txt?s=TICKER&a=1&b=1&c=1998&ignore=.txt"
And like before, replace TICKER with whatever stock ticker you want.
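If you would rather not edit the URL by hand each time, the TICKER swap can be done in a small sub. This is a minimal sketch: the URL is copied straight from the note above, and history_url is just a name made up for this example.

```perl
use strict;
use warnings;

# URL template copied from the note above; TICKER is the placeholder.
my $URL_TEMPLATE = 'http://itable.finance.yahoo.com/table.txt?s=TICKER&a=1&b=1&c=1998&ignore=.txt';

# Return the data feed URL for one ticker by filling in the placeholder.
sub history_url {
    my ($ticker) = @_;
    (my $url = $URL_TEMPLATE) =~ s/TICKER/$ticker/;
    return $url;
}

print history_url('KO'), "\n";
```

In getSingleHistory.pl the call would then look like get(history_url($ticker)).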
Posted by ESS at February 25, 2005 05:13 PM