Big Data Kenya


Big Data Kenya participants on campus at TU Kenya (all photos taken by Brian Matoke)

Carringtone Kinyanjui and Bonface Osoro are former DARA Big Data and DARA Project students who returned to their native Kenya last year. Both used data science and machine learning heavily in their Master’s studies and want their peers in Kenya to have access to the same skills that they have been able to develop. They had the idea of holding an intensive 5-day workshop in Nairobi covering a variety of topics, including working in industry, data science concepts and solving problems using programming. The Technical University of Kenya (TUK) was chosen as the venue because both Carringtone and Bonface are tutors there and have academic support from Prof Paul Baki of the Department of Astronomy and Space Science.


A third tutor was also lined up: Dr Miriam Nyamai, a former DARA Big Data Policy Fellow who is originally from Kenya and currently works in Cape Town. However, at the end of June the COVID19 rates in South Africa rocketed and President Ramaphosa requested that all non-essential travel not go ahead, meaning Miriam’s travel sadly had to be cancelled. Luckily, Carringtone and Bonface were able to find a replacement just in time: Roger Odipo, a Cloud Solutions Developer based in Nairobi.



To keep participants and staff members as safe as possible during the COVID19 pandemic, it was decided that the event should have only 16 people in attendance. Those 16 would be gender-balanced, as it is clear from this report that much more needs to be done to encourage women into the data science arena. An invitation to apply for a place at the event was sent out across the DARA Big Data network and attracted an incredible 400 applications! Clearly there is a huge amount of interest and enthusiasm in Nairobi for data science, which is perhaps unsurprising given that it is one of the top 3 Tech Hubs in Africa. Selecting just 16 to attend in person was no easy task; however, it was decided that the first 2 days of the event would be livestreamed for all those who did not manage to get a place.


The event was scheduled to take place in the last week of July. The plan was for the first day to consist of presentations from academic and industry speakers, with the second day focussing on data science and machine learning tutorials. The following 3 days would be a hackathon competition, with teams competing against each other for prizes. The hackathon was put together by Dr Nikhita Madhanpall, who works with DARA Big Data and the Office of Astronomy for Development (OAD) to implement a hackathon learning programme across AVN countries. Dr Madhanpall provided ample guidance to tutors on the hackathon project so that they were comfortable taking participants through it. Ahead of the event a lot of key details were put in place: an event logo was created, a photographer was booked, face masks and hand sanitiser were procured, and Big Data Kenya t-shirts, notebooks, pens and team prizes were ordered.



On Monday 26th July Big Data Kenya began with a welcoming address from Prof Baki, who asked everyone to introduce themselves to the group with an outline of why they wanted to attend. People were also watching on YouTube (the livestreams are captured here), where they were able to leave comments on the proceedings and discuss topics among themselves. Two of the presenters that afternoon were from Nairobi's prestigious Microsoft Africa Research Institute, known as MARI. Dr Kagonya Awori of MARI gave a fascinating presentation on AI and the 'dark patterns' used by many businesses, which are designed to confuse website visitors, often to the advantage of the business. Dr Sam Chege Maina, also of MARI, gave an in-depth and useful presentation on the data science landscape in Africa and the varied work opportunities available on the continent. Daniel Laah Ayuba, who works in Nigeria at Solution Social Network, put together a detailed talk on the applications of data science in healthcare, in areas such as genetics, predictive analysis and drug discovery. Other speakers, such as Prof Anna Scaife of the University of Manchester and Dr Carolina Odman-Govender of the Inter-University Institute for Data Intensive Astronomy (IDIA), gave overviews of big data and data science in industry.

Tuesday started with OAD’s Dr Tawanda Chingozha talking about the use of science for development; he was later followed by Dr Ian Jones talking about the massive amounts of space data generated at Goonhilly Earth Station, where he is CEO. The 3 tutors then took turns giving carefully prepared tutorials throughout the day; topics included data preparation, deep learning, model evaluation, support vector machines and other key areas. Participants could ask questions throughout to ensure they understood everything fully. In the afternoon the TUK Vice-Chancellor, Prof Francis Aduol, made a surprise visit to the laboratory to speak to the attendees. Prof Aduol has a keen interest in Space Science and Technology and wanted to find out more about those who had signed up to attend and what they planned to do with their newly developed skills. The strong support of Prof Aduol and Prof Baki for the Big Data Kenya event and for the tutors is very much appreciated.


Dr Nikhita Madhanpall began Wednesday morning by running through the guidelines for the hackathon, which would cover the final 3 days of the event. Participants were put in their teams and went through the hack tutorials, supported by the tutors. As with previous events, the hack platform and associated support were provided by IDIA. The hackathon project was to analyse posts on Twitter with a #COVID19 hashtag using sentiment analysis; teams needed to learn how to use Natural Language Processing (NLP) and how to scrape Twitter for relevant data to use. The final task, which teams began on Thursday afternoon, was to set themselves a research question and then answer it using a dataset they had prepared, presenting findings to their peers and a panel of judges. All teams made excellent progress and were ready on time with their presentations.
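
For readers unfamiliar with sentiment analysis, the basic idea can be sketched in a few lines of Python. This is not the hackathon's actual material: the word lists, function names and scoring rule below are all assumptions made up for illustration, and real projects would normally use a trained model or an established lexicon rather than a handful of hand-picked words.

```python
import re

# Tiny hand-made word lists, purely illustrative (not from the hackathon materials).
POSITIVE = {"good", "great", "hope", "safe", "relief", "grateful"}
NEGATIVE = {"bad", "fear", "worried", "sick", "angry", "tired"}


def tokenize(text):
    """Lowercase the text and split it into word tokens, keeping hashtags intact."""
    return re.findall(r"#?\w+", text.lower())


def has_hashtag(text, tag="#covid19"):
    """Return True if the post contains the given hashtag (case-insensitive)."""
    return tag.lower() in tokenize(text)


def sentiment_score(text):
    """Count positive words minus negative words in the post."""
    tokens = tokenize(text)
    positives = sum(token in POSITIVE for token in tokens)
    negatives = sum(token in NEGATIVE for token in tokens)
    return positives - negatives


def sentiment_label(text):
    """Map the numeric score onto a coarse positive/negative/neutral label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    # Made-up example posts, not real tweets.
    posts = [
        "Feeling grateful and safe after my jab #COVID19",
        "Worried and tired of lockdown #covid19",
        "Great weather today",
    ]
    for post in posts:
        if has_hashtag(post):
            print(sentiment_label(post), "|", post)
```

A word-counting approach like this is easy to follow but misses negation, sarcasm and context, which is one reason NLP techniques are used for this kind of task.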




Dr Sam Chege Maina of MARI very kindly returned to TUK on the final afternoon of Big Data Kenya to judge the presentations alongside the tutors and Drs Odman-Govender and Madhanpall, who were watching online. Three of the teams produced intriguing and thoughtful analyses of reaction to the COVID19 vaccine, looking at different areas of Africa and also internationally. The fourth team put together a presentation on the effects of COVID19 on mental health, which was an interesting angle and contained a lot of detail. The judges had a tough decision ahead of them! It took a lot of deliberating but eventually it was decided that Team Simba’s presentation on mental health and COVID19 was the winner, with Team TweetVader coming a close second with their vaccine analysis. Prizes were given to these teams and then there was just time for an Ask Me Anything session with a panel of experts before the event closed.

Feedback from those in attendance after the event was extremely positive, with the three tutors receiving a lot of praise for their teaching methods and knowledge. One participant said ‘This hackathon opened my mind to unlimited opportunities. Data science skills are the new problem solvers in town.’ Dr Nikhita Madhanpall was particularly impressed by the enthusiasm of all involved, saying ‘The Big Data Kenya event was a huge success, largely due to the amazing tutors at TU Kenya who even offered an extra day of data science classes for participants to learn a little theory before seeing the techniques in action in the hackathon. Participants produced impressive results and presentations, despite some having limited prior experience in the field, and should be very proud.’ Participant Leonida Hawi said that she felt she was being exposed to areas that she had previously thought were beyond her reach: 'What an amazing opportunity it was to hear from people doing so much in their fields and who were giving us an opportunity to learn and to do the same. I am excited to be part of the brains that will help build and develop Kenya, Africa and the rest of the world in this age of the 4th Industrial Revolution.'

Bonface, Carringtone and Roger have now set up a data reduction pipeline on GitHub for reducing Twitter data using Tweet IDs, with an eye on potential future projects. You can have a look at it here.

The three Big Data Kenya tutors, from left to right: Carringtone Kinyanjui, Roger Odipo and Bonface Osoro


Dr Sam Chege Maina of MARI presents a member of Team TweetVader with their prize