Data Science | Season One, Lesson 4 | What Is a Data Scientist? by @nova001 | 10% to steem.skillshare

in hive-197809 •  3 years ago 

What is a data scientist?

As a separate discipline, Data Science emerged quite recently, at the intersection of statistical analysis and data mining. The first issue of the Data Science Journal was published in 2002 by the Committee on Data for Science and Technology (CODATA) of the International Council for Science. By 2008, the title of data scientist had emerged and the industry was booming. Even though more institutions of higher education are training data scientists, there is still a shortage of them.

The data scientist's responsibilities include developing analysis strategies, preparing data for analysis, exploring, analyzing and visualizing data, building data-driven models in programming languages such as Python and R, and deploying those models in applications.
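To make these responsibilities more concrete, here is a minimal sketch in Python of the prepare, explore, model and apply steps. The toy data and the function names (`prepare`, `explore`, `fit_line`, `predict`) are hypothetical and chosen only for illustration. A real project would normally rely on dedicated data science libraries rather than hand-written code.

```python
from statistics import mean

# Hypothetical toy data: (advertising spend, sales). None marks a missing value.
raw_rows = [(1.0, 2.1), (2.0, 3.9), (None, 5.0), (3.0, 6.2), (4.0, 7.8)]


def prepare(rows):
    """Data preparation: drop rows that contain missing values."""
    return [(x, y) for x, y in rows if x is not None and y is not None]


def explore(rows):
    """Exploration: compute a few basic summary statistics."""
    xs, ys = zip(*rows)
    return {"n": len(rows), "mean_x": mean(xs), "mean_y": mean(ys)}


def fit_line(rows):
    """Modeling: ordinary least squares fit of y = slope * x + intercept."""
    xs, ys = zip(*rows)
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in rows)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


def predict(model, x):
    """Application: use the fitted model on a new input."""
    slope, intercept = model
    return slope * x + intercept


clean = prepare(raw_rows)
summary = explore(clean)
model = fit_line(clean)
print(summary)
print(predict(model, 5.0))
```

Each function corresponds to one responsibility from the paragraph above, which keeps the steps easy to test and to hand over to other team members.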


The data scientist does not work alone. Efficient data mining requires a multidisciplinary team. In addition to the data scientist, it should include: a business analyst who defines the task; a data processor who is responsible for preparing and accessing the data; an IT systems architect who maintains the necessary processes and infrastructure; an application developer who embeds models or analysis results into applications and products.

Difficulties in implementing Data Science in companies

Despite the benefits that Data Science brings to businesses and the large investments in this industry, not every company gets full value from its data. In the rush to hire the right people and set up data science programs, some companies have ended up with inefficient team workflows, with different people using different tools and incompatible processes. Stronger centralized leadership is needed to ensure a return on investment.


Without that centralized leadership, several problems arise.

Data scientists cannot work effectively.
Since the IT administrator controls access to data and analysis resources, data scientists often have to wait for what they need. Once they have access to the data, they must analyze it using a variety of tools that are often incompatible with each other. For example, a model may be developed in R while the application that is meant to use it is written in a different language. As a result, getting a model into an application can take several weeks or even months.
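One common way to reduce this friction is to hand over a model in a language-neutral format instead of as code. The sketch below is only an illustration under stated assumptions: the JSON layout and the function names `export_model` and `load_and_predict` are hypothetical, and the model is a simple linear one. In practice, teams often use established exchange formats such as PMML or ONNX for this purpose.

```python
import json


def export_model(slope, intercept):
    """Serialize a linear model to JSON so any language can read it."""
    return json.dumps({"type": "linear", "slope": slope, "intercept": intercept})


def load_and_predict(payload, x):
    """What the consuming application does: parse the JSON and apply the model."""
    spec = json.loads(payload)
    if spec["type"] != "linear":
        raise ValueError("unsupported model type: " + str(spec["type"]))
    return spec["slope"] * x + spec["intercept"]


# The data scientist exports the model; the application only needs the payload.
payload = export_model(2.0, 1.0)
print(load_and_predict(payload, 3.0))
```

Because the payload is plain JSON, the same few lines of loading logic could be written in whatever language the application uses.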

Application developers cannot use machine learning models directly.
It is not uncommon for application developers to receive machine learning models that are not ready for use in applications. Such models are often too inflexible to cover every required scenario, which forces the application developers to rework them.

IT administrators spend too much time on maintenance.
The number of open source tools is constantly growing, which increases the burden on administrators. For example, data scientists in marketing and in finance may use very different tools. They also follow different processes, so administrators constantly have to change and extend the infrastructure.

Business leaders are too far from the problems of Data Science.
Data Science processes are not always integrated into business processes and decision-making systems, and not all managers understand the specifics of this work well enough. They find it difficult to understand why it takes so long to develop a prototype and put it into production, and the lack of quick results leads to reduced funding.

The end of lesson 4


Links to my previous lessons

Lesson 1: Understanding data science

Lesson 2: Processes of data analysis in data science

Lesson 3: Data Science tools and processes


Congratulations, your nice post has been upvoted by the steem.skillshare curation trail!
Please check out the steem.skillshare curation trail post for information about our trail.


As described in this post, this user and @moodswings have been identified as content-farming, duplicate accounts. For that reason, these users have been blacklisted.


Team, #sevengers.

That was a mistake by my team member @zmoreno. I blacklisted the user and reported the case to @whitestallion last night. The vote has been removed with immediate effect.


Yes, it has to be removed with immediate effect. I am muting his post, and I will also alert the Admin to mute him from the community. Thank you for this.

The vote has already been withdrawn. Sorry for the inconvenience, and thank you very much for bringing this case of abuse to the attention of the community. Regards.