Hi guys, I recently attended an EMC certification event on data science and big data analytics, conducted by ICT Academy. It gave me a whole new perspective on the emerging trends in Big Data. The certification on offer was the EMCDSA (EMC Data Science Associate).
Nowadays, Big Data is one of the most trending technologies and a valuable platform for a data scientist. Even a fresher can reportedly land a package of around 17L per year. Where a software engineer may struggle to get by, a data scientist can rule the world. Let's see why.
Big Data is data whose scale, distribution, diversity, and timeliness require the use of new technical architectures and analytics to enable insights that unlock new sources of business value. Three core concepts characterize big data:
- Volume
- Velocity
- Variety
Many people ask why Facebook stores so much data.
You can see the answer on most pages, in prompts like "you may like this page", "you may like this post", or "you have a suggestion from your friend".
Facebook analyzes each user's sequence of activity. Running that many analytical processes needs a large amount of data, so it gathers data on every individual. Any single piece of data may turn out to be useful or useless, but the organization stores all of it.
At first, Facebook worked mainly with text messages, and frequent updates then improved its performance. Today it runs machine learning algorithms over millions of data points every second.
Consider Amazon Prime, a streaming service for web series that consumes a lot of data. The company collects information about each user's searches and uses it widely to improve the whole platform, personalizing the experience for every user.
Big data analysis is also used to work out the profit and loss of a particular year for a particular technology.
Before going deeper into the big data concept, we first need to study the following.
KEY CHARACTERISTICS OF BIG DATA
- Data Volume: the total amount of data held in storage
- Processing Complexity: how much work it takes to process each data set quickly and easily
- Data Structure: a simple structure makes information easy to retrieve and convenient to use
DATA STRUCTURE GROWTH
[Figure: growth of data structures]
Structured: The data is arranged in a defined format. Any data scientist can easily retrieve the particular data needed.
Semi-Structured: The data follows self-describing file formats such as JSON (JavaScript Object Notation) and XML (Extensible Markup Language), and is often held in NoSQL databases.
Quasi-Structured: Textual data with erratic formats that takes effort and tools to organize, such as the clickstream left behind by a Google search. It can be searched using only partial information, such as the query keyword.
Unstructured: Data with no inherent structure, such as collections of text, video, audio, and so on. Any type of data format can fall into this category.
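To make the difference between these categories a bit more concrete, here is a minimal Python sketch. The order records, field names, and review text are all made up for illustration; the point is only how much structure each form gives you to work with.

```python
import csv
import io
import json

# Structured: fixed columns, every row has the same fields.
csv_text = "order_id,customer,amount\n101,Asha,250.0\n102,Ravi,99.5\n"
structured_rows = list(csv.DictReader(io.StringIO(csv_text)))

# Semi-structured: self-describing JSON; fields can vary from record to record.
json_text = '{"order_id": 103, "customer": "Meena", "items": [{"sku": "A1", "qty": 2}]}'
semi_structured = json.loads(json_text)

# Unstructured: free text with no schema; all we can do directly is search or count.
review = "Delivery was late but the product quality is great."
word_count = len(review.split())

print(structured_rows[0]["customer"])      # look up a value by column name
print(semi_structured["items"][0]["qty"])  # walk the nested structure
print(word_count)                          # only crude measures without more tooling
```

Structured data can be queried by column right away, semi-structured data needs you to walk its nesting, and unstructured data needs extra processing before it answers any question.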
Two simple examples show the difference.
A structured URL looks like this:
https://navybird.blogspot.com/big-data-analytics-data-science/
An unstructured query URL looks like this:
https://www.blogger.com/blogger.g?blogID=3920101252358251749/img?#4322
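As a rough illustration, the two URLs above can be pulled apart with Python's standard `urllib.parse` module. The `describe` helper is just a name chosen for this sketch. The first URL carries its meaning in a readable path, while the second hides it in a query string and a fragment that need parsing before they tell you anything.

```python
from urllib.parse import parse_qs, urlsplit

def describe(url):
    """Split a URL into the parts that carry information (hypothetical helper)."""
    parts = urlsplit(url)
    return {
        "host": parts.netloc,
        "path": parts.path,
        "query": parse_qs(parts.query),  # dict of parameter name -> list of values
        "fragment": parts.fragment,
    }

readable = describe("https://navybird.blogspot.com/big-data-analytics-data-science/")
opaque = describe("https://www.blogger.com/blogger.g?blogID=3920101252358251749/img?#4322")

print(readable["path"])   # the topic is readable straight from the path
print(opaque["query"])    # the content is buried in an ID-style parameter
print(opaque["fragment"])
```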
DATA REPOSITORIES
Data Islands
Data Ware House
- A collection of organized data, made up of a number of libraries linked together in a chain. All the data sets are stored in one place.
Analytic Sandbox
BUSINESS DRIVERS:
- Desire to optimize
- Desire to Identify Business risk
- Predict New Business Opportunities
- Comply with laws or regulatory requirements
To optimize sales and profitability
To reduce fraud and customer churn
To enhance cross-sell prospects
- Laws can change every year, which adds complexity and data requirements for the organization. A major focus of such laws is Anti-Money Laundering.
BUSINESS INTELLIGENCE
Business Intelligence mainly focuses on using a consistent set of metrics to measure past performance and inform business planning. It covers the analysis of past and present data sets.
- It relies on traditional sources, structured data, and manageable data sets.
- It is based on standard and ad hoc reporting.
- What happened last quarter?
- How many did we sell?
- Where are the problems? In which situations?
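Questions like these are answered by summarizing data that already exists. A small sketch in Python, using made-up sales records (the quarters, regions, and unit counts are purely hypothetical):

```python
from collections import defaultdict

# Hypothetical sales records: (quarter, region, units_sold)
sales = [
    ("2023-Q3", "North", 120),
    ("2023-Q3", "South", 80),
    ("2023-Q4", "North", 150),
    ("2023-Q4", "South", 60),
]

def units_by_quarter(records):
    """How many did we sell? Total units per quarter."""
    totals = defaultdict(int)
    for quarter, _region, units in records:
        totals[quarter] += units
    return dict(totals)

def weakest_region(records, quarter):
    """Where are the problems? Region with the fewest units in a quarter."""
    by_region = defaultdict(int)
    for q, region, units in records:
        if q == quarter:
            by_region[region] += units
    return min(by_region, key=by_region.get)

totals = units_by_quarter(sales)
print(totals["2023-Q4"])                 # what happened last quarter
print(weakest_region(sales, "2023-Q4"))  # where to look for problems
```

Nothing here predicts anything. It only reports on the past, which is the defining trait of this kind of reporting.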
DATA SCIENCE
It draws on tons of stored data about the past and present to predict future data. Data science always includes Business Intelligence.
- It requires very large data sets.
- It works with structured or unstructured data.
- It provides optimization, forecasting, predictive modeling, and statistical analysis.
- What if...?
- What's the optimal scenario for our business?
- What will happen next? What if these trends continue?
- Why is this happening?
Big data must solve its problems using its own historical data. Past and present data are used to predict future data and to accurately calculate the outcome, whether profit or loss.
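One very simple way to picture "what if these trends continue?" is to fit a straight line through past values and extend it one step. The sketch below does that with plain least squares. The yearly profit figures are invented for illustration, and a straight-line trend is only one assumption among many that a real forecast might use.

```python
def fit_line(ys):
    """Least-squares line through points (0, ys[0]), (1, ys[1]), ..."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two past values to fit a trend")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def forecast_next(ys):
    """Extend the fitted trend one period past the last observation."""
    slope, intercept = fit_line(ys)
    return intercept + slope * len(ys)

# Hypothetical yearly profit figures (negative values would mean a loss).
profit = [10.0, 12.0, 14.0, 16.0]
next_year = forecast_next(profit)
print(next_year)
```

The forecast is only as good as the assumption that the past pattern holds, which is why data science pairs it with statistical analysis rather than trusting a single line.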