The analysis of large data sets is becoming an increasingly popular and important tool for acquiring new information in research and business. Google Trends shows this tendency over time:

[Figure: marcel1-rys2]

Data Science is growing as well, driven by the growth of Big Data, although interest in statistical software overall is lower.

[Figure: marcel1-rys1]

There may be three reasons for this:

  • Many basic analyses can be done in general-purpose programming languages, such as Java.
  • People search for concrete solutions or programs rather than for the problem in general.
  • People involved in the field have their favourite programs and do not seek out new ones.

R is one of the most popular programming languages intended for statistical analysis (the most popular according to Google Trends), and it is both a programming language and a software environment. It is the only one showing growth over time.

[Figure: marcel1-rys3]

There are many ways to learn R programming: books, websites, statistical blogs and even forums (such as Stack Overflow). With the development of MOOC platforms, R courses combined with methods of data analysis and visualization are offered on well-known platforms such as edX and Coursera. Recently a new course, Data Busters (organized by the Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, Poland), was launched.

This entry compares the Data Science Specialization offered by Coursera with Data Busters. It is worth noting that Data Busters started in early April 2015; it is a new course, so some matters are not yet clearly settled.

Coursera (Data Science Specialization): The format is open. Videos, texts, code and presentations are available for download.
Data Busters: The format is more closed. Presentations and texts are on the web only, although one can print the pages to PDF.

Coursera: You must pay for the certificate, and the total for all courses of the Specialization is relatively expensive.
Data Busters: A signed certificate is provided free of charge.

Coursera: Assessment is based on quizzes with specified answers (each of which you can attempt three times) and open projects (rated mutually by users during a peer assessment period).
Data Busters: Quizzes have no specified answers; the result must match an established form, but you can check what went wrong and try to fix it. There are no additional projects to complete.

Coursera: One part takes about a month and follows a predetermined schedule.
Data Busters: One part takes about a month and you can work through it asynchronously. There is no predetermined plan.

Coursera: From the beginning you know which part of the Specialization comes next, perhaps because the individual courses are already fixed and are repeated monthly.
Data Busters: For now, all materials suggest that there will be two parts, although there is natural room for more parts in the future.

Coursera: The level ranges from basic to intermediate.
Data Busters: For now, the course is at a basic level.

Coursera: You have access to the course forum and other forums, as well as to materials on GitHub.
Data Busters: You have access to the course forum.

Coursera: The certificate is obtained via the website, with the option to „show off” through LinkedIn. The integration works well.
Data Busters: The University of Warsaw signs the diploma.

Coursera: Registration requires a Coursera account, which then gives access to all information.
Data Busters: Registration is available via social networking services such as Facebook or Google, without the need to create a new account.

Coursera: The course also teaches many „associated” tools that make life easier in data science, e.g. Shiny, GitHub, RPubs, R Markdown, Slidify and other packages.
Data Busters: For now, the course focuses on methods of data analysis, processing and visualization.

Coursera: In English.
Data Busters: In Polish and English.

Coursera: A very elaborate format, sometimes in the negative sense of the word: the same information could fit into approx. 60% of the courses and of the time devoted to them.
Data Busters: A very condensed format that encourages your own research.

Coursera: Presentations are sometimes prepared in a chaotic manner, which is noticeable given the need to watch long videos. Doubtless, this also depends on one's prior preparation in the topic.
Data Busters: The form is highly ordered and clear.

Coursera: The Specialization assumes that the learning ends with it; further improvement should come from experience and from other subjects suggested at the end of the courses.
Data Busters: The course encourages and refers to courses on the Coursera and edX platforms.

Coursera: Course materials are accessible for only about a week after the course ends; after that, only if you have downloaded them to your computer.
Data Busters: There is no information about access to the course materials after the course ends. They exist only as web materials.

As you can see, the comparison is difficult because the courses differ in length and in approach to the subject. The one offered by Coursera is more extensive and older, while the Data Busters course is only just beginning its adventure. For this reason, some questions that arose in the comparison will only be answered in the future. Let’s hope that it evolves in the right direction.

Nevertheless, Data Busters seems to be an interesting alternative to Coursera, especially for those starting their adventure with data analysis and programming in R. The appearance of such courses is a very good trend, because they give an opportunity to expand one's knowledge and to improve „in competition”.
