How to Create a YouTube Crawler

    Hi guys,
    I am new to the YouTube Data API. I want to make a crawler that grabs the "video title", "video link", "video description" and "video thumbnail link" of each video and saves them to a database. Can anyone on this forum help me with this?

    #2
    I'd give up, lol. Here's a little math based on figures from April 2008, when there were an estimated 83.4 million videos on YouTube.

    Now let's say I make up some averages (they will be lower than the real averages, I imagine):

    title: 20 characters
    link: 10 characters (we don't need to store the full URL, only the unique ID)
    description: 50 characters
    thumb link: we don't need to store this, since the video link and the thumb link are both static URLs with the ID in them, so the thumb URL can be built from the ID
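
    To illustrate the point about only storing the ID, here is a minimal sketch of rebuilding both URLs from it. The function names and the sample ID are made up for illustration, and the URL patterns are the commonly used ones, so treat them as assumptions and check them against the current YouTube documentation.

```php
<?php
// Hypothetical helpers: rebuild the static URLs from a stored video ID.
// The URL patterns below are assumptions based on the commonly seen formats.

function videoUrl($id) {
    // Watch page URL with the ID as the "v" query parameter.
    return 'https://www.youtube.com/watch?v=' . rawurlencode($id);
}

function thumbUrl($id) {
    // Default thumbnail URL with the ID as a path segment.
    return 'https://img.youtube.com/vi/' . rawurlencode($id) . '/default.jpg';
}

$id = 'abc123XYZ_0'; // made-up ID, for illustration only
echo videoUrl($id), "\n";
echo thumbUrl($id), "\n";
```

    With this approach the database row only needs the ID column, and the two links are generated when the page is rendered.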

    OK, so each video record needs a minimum of 80 characters, which is 80 bytes per record in a single-byte charset:

    80 bytes x 83400000 = 6672000000 bytes

    6672000000/1024/1024/1024 = 6.2 GB
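
    The same arithmetic as a small script, in case you want to plug in your own averages. The function names are hypothetical, and the figures are just the made-up averages from above.

```php
<?php
// Back-of-the-envelope storage estimate using the averages from the post.

function estimateBytes($bytesPerRecord, $videoCount) {
    return $bytesPerRecord * $videoCount;
}

function toGiB($bytes) {
    return $bytes / (1024 * 1024 * 1024);
}

$bytesPerRecord = 20 + 10 + 50; // title + video ID + description
$videoCount     = 83400000;     // estimated number of videos, April 2008

$total = estimateBytes($bytesPerRecord, $videoCount);

// Prints: 6,672,000,000 bytes = 6.2 GB
printf("%s bytes = %.1f GB\n", number_format($total), toGiB($total));
```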

    As I said, my averages are well below the real ones. On top of this, you can double the size if you use a wider character set, and it grows further once you create table indexes and store any additional data.

    Are you really prepared to even try a 6.2 GB table and then see how long it takes to search through the damn thing? In reality, what you'd have to do is split the records across a minimum of 80 tables.
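
    If you did split the records up, one possible way (a sketch, not the only option) is to pick the table from a hash of the video ID. The function name, the "videos_NN" table naming and the use of crc32() are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: map a video ID to one of N tables (videos_00 .. videos_79).

function tableFor($videoId, $tableCount = 80) {
    // crc32() can return a negative int on 32-bit PHP, so format it as unsigned first.
    $hash  = (float) sprintf('%u', crc32($videoId));
    $index = (int) fmod($hash, $tableCount);
    return sprintf('videos_%02d', $index);
}

echo tableFor('abc123XYZ_0'), "\n"; // made-up ID; the same ID always maps to the same table
```

    Because the table is derived from the ID, a lookup by ID only touches one table, but a search by title would still have to query all of them.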

    Can it be done? Yes
    Should you do it? Not if there's any other way
    Will we do it for you? Maybe
    Will we do it properly so it's actually usable? Doubtful

    You'd be much better off using just-in-time (JIT) code that pulls the data from the API only when it is needed, such as:

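    Code:
    //=== get feed for the submitted search term
    $searchterm = isset($_POST['searchterm']) ? $_POST['searchterm'] : '';
    // URL-encode the term so spaces and special characters don't break the request
    $feed = file_get_contents("http://gdata.youtube.com/feeds/api/videos/-/" . rawurlencode($searchterm));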
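
    Once you have the feed, you still need to pull the fields out of it. Here is a rough sketch using SimpleXML, assuming the feed is Atom XML. The sample document is a trimmed, made-up stand-in (a real feed has many more fields and namespaces), and parseEntries() is a hypothetical helper name.

```php
<?php
// Hypothetical, trimmed Atom sample used instead of a live request.
$sample = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>Example video</title>
    <content>Example description</content>
    <link rel="alternate" href="http://www.youtube.com/watch?v=abc123XYZ_0"/>
  </entry>
</feed>
XML;

function parseEntries($xmlString) {
    $feed = simplexml_load_string($xmlString);
    if ($feed === false) {
        return array(); // not valid XML
    }
    $rows = array();
    foreach ($feed->entry as $entry) {
        // Pick the link marked rel="alternate" (assumed to be the watch page).
        $link = '';
        foreach ($entry->link as $l) {
            if ((string) $l['rel'] === 'alternate') {
                $link = (string) $l['href'];
            }
        }
        $rows[] = array(
            'title'       => (string) $entry->title,
            'description' => (string) $entry->content,
            'link'        => $link,
        );
    }
    return $rows;
}

print_r(parseEntries($sample));
```

    In the JIT approach you would call something like parseEntries($feed) right after the request and render the rows directly, without storing anything.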
    Also, don't forget that thousands of videos are uploaded daily, so to keep up to date you need either a latest-uploads API or an intelligent system.

    If you're hell-bent on doing it, I'd suggest you write a loop (or use a password cracker) to generate a list of every possible 10-character ID made up of A-Z and 0-9, and then check whether each ID exists. But I'll reiterate: it's not worth doing.
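
    To put a number on why brute-forcing IDs isn't worth it, here is a quick count of the candidates for the 10-character, A-Z plus 0-9 case described above, which comes to roughly 3.66 x 10^15. The function name and the rate of 1,000 checks per second are assumptions picked for illustration. As far as I know, real video IDs are 11 characters and case-sensitive, which would make the space far larger still.

```php
<?php
// Count how many IDs a brute-force generator would have to try.
// Assumes 64-bit PHP so the result fits in an integer.

function candidateCount($alphabetSize, $length) {
    $count = 1;
    for ($i = 0; $i < $length; $i++) {
        $count *= $alphabetSize;
    }
    return $count;
}

$total           = candidateCount(36, 10); // 26 letters + 10 digits, 10 characters
$checksPerSecond = 1000;                   // assumed rate, purely illustrative
$years           = $total / $checksPerSecond / (365 * 24 * 3600);

printf(
    "%s candidate IDs, about %s years at %d checks per second\n",
    number_format($total),
    number_format($years),
    $checksPerSecond
);
```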
