Talk Proposal Submission

If you are interested in attending this talk at PyCon JP 2017, please use the social media share buttons below. We will consider the popularity of the proposals when making our selection.


Using machine learning to try and predict taxi availability (en)


Hari Allamraju

Audience level:



Big Data


In this talk we will use taxi availability data from Singapore to learn how to predict taxi availability with machine learning, and discuss how such predictions might help consumers and taxi companies.


The audience can expect to get the following from this talk:
1) See how we can apply machine learning to real-life data
2) Get an idea of the issues faced and the lessons learnt
3) Get pointers for discussing how a consumer or taxi company might use such predictions for their benefit


Taxis nowadays are equipped with devices which can provide their location very accurately. These can be used to get a snapshot of taxi availability at any point in time. The Singapore government provides an open API which returns such a snapshot in the form of the locations of available taxis across Singapore.

A single snapshot only shows the current state. By querying the API at periodic intervals we can build a picture of how availability changes across Singapore. These changes reflect several factors, such as drivers moving around looking for riders, taxis getting hired, and taxis dropping off passengers. If we analyze snapshots taken over a few days and apply machine learning to them, we can try to predict the taxi availability at any location for a given time of day.

What we learn from such an analysis can be combined with other data sets, such as weather, rider demand, and news events, to understand or predict how people will move across the city. This can be very useful for consumers, taxi companies and even the government. Such systems probably already exist at the major taxi and ride-sharing companies, so this talk will focus on the following aspects to help the audience learn more about them:

- Data collection
- Processing the data into a format we can use
- Identifying the parameters that we can learn or analyze from the data
- A few examples of the analysis
- Presenting the results
- A brief discussion on how the data can be used in conjunction with other data sets

After the talk, the audience can use the slides and information as a reference if they want to perform such an analysis on their own or use it for learning. We can also learn from any audience members who have worked on such systems and who may want to add some information during the Q&A session.
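The processing and prediction steps described above can be illustrated with a minimal sketch. It is not the speaker's method. It assumes each collected snapshot has already been parsed into a dict with a `timestamp` and a list of `[longitude, latitude]` pairs (a hypothetical format), buckets taxis into a coarse grid, and uses the average count per grid cell and hour of day as a simple baseline in place of a trained model. The names `grid_cell`, `build_baseline`, `predict` and the grid size `CELL_DEG` are illustrative assumptions.

```python
import math
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical grid size in degrees (about 1.1 km near the equator).
CELL_DEG = 0.01


def grid_cell(lon, lat, cell_deg=CELL_DEG):
    """Map a coordinate to the integer index of the grid cell containing it."""
    return (math.floor(lon / cell_deg), math.floor(lat / cell_deg))


def build_baseline(snapshots):
    """Average number of available taxis per (hour of day, grid cell).

    Each snapshot is assumed to look like
    {"timestamp": datetime, "coordinates": [[lon, lat], ...]}.
    """
    totals = defaultdict(Counter)  # hour -> Counter(cell -> taxis seen)
    n_snapshots = Counter()        # hour -> number of snapshots taken
    for snap in snapshots:
        hour = snap["timestamp"].hour
        n_snapshots[hour] += 1
        for lon, lat in snap["coordinates"]:
            totals[hour][grid_cell(lon, lat)] += 1
    # Divide by all snapshots in that hour, so a cell that is empty in some
    # snapshots is averaged over those snapshots too.
    return {
        hour: {cell: total / n_snapshots[hour] for cell, total in cells.items()}
        for hour, cells in totals.items()
    }


def predict(baseline, lon, lat, hour):
    """Expected number of available taxis near (lon, lat) at the given hour."""
    return baseline.get(hour, {}).get(grid_cell(lon, lat), 0.0)


# Made-up sample data standing in for snapshots collected at intervals.
snapshots = [
    {"timestamp": datetime(2017, 6, 1, 8, 0),
     "coordinates": [[103.853, 1.303], [103.857, 1.306], [103.901, 1.352]]},
    {"timestamp": datetime(2017, 6, 1, 8, 30),
     "coordinates": [[103.853, 1.303]]},
    {"timestamp": datetime(2017, 6, 1, 18, 0),
     "coordinates": [[103.901, 1.352]]},
]

baseline = build_baseline(snapshots)
print(predict(baseline, 103.855, 1.305, hour=8))  # 1.5 with the sample data
```

A per-cell, per-hour average like this is useful mainly as a reference point: any learned model that also takes in weather or event data should at least beat it on held-out snapshots.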