Custom APIs and Web Scraping for Science

So my team’s most recent application, Helix, involved genome visualization. We integrated it with the 23andMe API, but still needed a way to find interesting information about specific rsIDs (the identifiers researchers and databases use to refer to specific SNPs, which are single-base variations in DNA). By far the most useful open repository of genetic information is SNPedia, but I needed access to lots of information and a way to integrate calls for specific SNPs into our app. Basically, I needed an API. So, being ever resourceful, I decided to make my own.

Tools for the task were an easy choice. I needed a small, fast server that I could implement a web scraper on. I have always wanted a reason to use BeautifulSoup, and since it’s a Python library, I knew it would be easier to build a Python server to run the API endpoints. I chose Flask because of its lightweight nature and how much it reminds me of a Node/Express server at times.

Thankfully, there are some really good tutorials for both Flask and BeautifulSoup. My favorites (and the ones I referenced when I hit weirdness) were “Designing a RESTful API” and “Website Scraping with BeautifulSoup”. Both of these tutorials said a lot of things better than I could have myself.
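
For a rough idea of how the two pieces fit together, here is a minimal sketch (not the actual project code). It assumes Flask and BeautifulSoup are installed. The route `/snp/<rsid>`, the helper `scrape_snp`, and the sample markup are all made up for illustration; SNPedia's real page structure is different, and a real version would fetch the live page instead of using a hard-coded string.

```python
from bs4 import BeautifulSoup
from flask import Flask, jsonify

app = Flask(__name__)

# Stand-in for a fetched page. This markup is invented for the example
# and is NOT SNPedia's real page structure.
SAMPLE_HTML = """
<html><body>
  <h1 id="firstHeading">Rs0000 (placeholder)</h1>
  <div id="summary"><p>Placeholder summary text.</p></div>
</body></html>
"""


def scrape_snp(html):
    """Pull a title and a summary out of one page of HTML (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    title = soup.find("h1")
    summary = soup.find("div", id="summary")
    return {
        "title": title.get_text(strip=True) if title else None,
        "summary": summary.get_text(strip=True) if summary else None,
    }


@app.route("/snp/<rsid>")
def get_snp(rsid):
    # A real version would download the page for `rsid` here;
    # this sketch just reuses the sample markup above.
    data = scrape_snp(SAMPLE_HTML)
    data["rsid"] = rsid
    return jsonify(data)
```

Keeping the scraping in its own function means it can be tried out on saved HTML without starting a server, and the Flask route stays a thin wrapper that turns the result into JSON.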

For access to my SNPedia API and information on how to use it, check out my project on GitHub.


Lindsey Learns APIs

Starting this countdown to California/Hack Reactor has brought back my love of learning in ways that online classes couldn’t do. I love to learn. I’m the kind of girl that if I don’t know the answer to your question (“Are inch worms an inch long?” - bad example as I do know that they get their name from the way they “inch” forward, but still), I will stop the conversation immediately, whip out my phone and find you the answer. I’m sure I’ve pissed off friends and relatives, but I think they are used to me by now. It’s something that I just have to do. Some people will make up things or say “who cares”, but I really want to get down to the truth, learn a new bit of trivia, something.

So now that I’m learning more things I keep coming up with questions. Like suddenly the veil has been lifted and JSON isn’t this mythical acronym but something I can actually parse and use. But my brain of course took it a step further. APIs are similar, right? - said my brain. But what exactly are they? I’ve pulled data out of one but it was very hand-holdy. How does that data even get there? How does an app get an API? To Google, my friends.

According to API Evangelist, APIs are tools individuals can use to access companies’/individual apps’ data and functions. They allow external users to access internal info safely. There’s _lots_ of stuff on this page though and my eyes kinda glazed over about half a page down. Time for a new track.

I learn better by doing, so why not make our own little API? Unsurprisingly, the internet has us covered. Let’s Create your first API. This is in PHP and very straightforward. But if you’re like me, you may want something in your favorite web app language: JavaScript! So let’s go ahead and try Creating a REST API Using Node.js.

Obviously I’ve only just scratched the surface of this stuff, but I want to try and play around with this and more in my last two weeks of downtime. One last link for good measure: How to Design a Good API.