MALAYSIAN ITALY DESIGN INSTITUTE (MIDI)
ENTERPRISE RESOURCE PLANNING
ASSIGNMENT: BIG DATA ANALYTICS
AIMAN AFIQ BIN MOHD ROSLI
58215117103
DR PUNNOSE P KOOVOR
What Is Big Data Analytics
Big data analytics is the use of advanced analytic techniques against very
large, diverse data sets that include structured, semi-structured and unstructured
data, from different sources, and in sizes ranging from terabytes to zettabytes.
Big data is a term applied to data sets whose size or type is beyond the
ability of traditional relational databases to capture, manage and process the
information with low latency. Big data has one or more of the
following characteristics: high volume, high velocity or high variety. Artificial
intelligence (AI), mobile, social and the Internet of Things (IoT) are driving data
complexity through new forms and sources of data. For example, big data
comes from sensors, devices, video/audio, networks, log files, transactional
applications, the web and social media, much of it generated in real time and at a
very large scale.
Analysis of big data allows analysts, researchers and business users to
make better and faster decisions using data that was previously inaccessible or
unusable. Businesses can use advanced analytics techniques such as text analytics,
machine learning, predictive analytics, data mining, statistics
and natural language processing to gain new insights from previously untapped data
sources, either independently or together with existing enterprise data.
Role of Big Data Analytics in ERP Systems
A large amount of data is being generated each day from sources such
as web channels, IoT, company servers, media and so on. ERP
software solutions gather plenty of enterprise data covering HR, finance, CRM and
other essential functions of a business. Big data analytics tools and an ERP
system, when brought together, have the potential to uncover valuable insights that
will help businesses make smarter decisions. In fact, big data analytics has a
vital role to play in enhancing ERP capabilities and getting the most out of the ERP
system.
Moreover, it helps make sense of unused information and also accelerates the
entire decision-making process. Full integration is a vital step forward in
this cut-throat competitive business environment, because
making the right decisions quickly is essential both to survive in
this competitive market and for smooth business functioning.
Many ERP systems fail to make use of real-time inventory and supply chain
data. This happens mainly because these systems lack the intelligence to
make predictions about product demand. In this situation, big data tools can
predict demand and help determine where your company must go.
ERP systems manage all the business data and give useful insights into
the demand-and-supply equation. However, big data analytics
can go beyond this: it can produce better forecasts of demand and future needs. Data
analytics tools not only help determine current consumer demands and behaviors but also
predict how a product is going to perform in the future. This insight can only be gained
by integrating both big data analytics and Enterprise Resource Planning
systems within the business.
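As an illustrative sketch (not tied to any specific ERP product), forecasting demand from historical ERP sales records can be reduced to fitting a simple trend; the monthly figures below are invented for the example.

```python
# Hypothetical monthly unit sales pulled from an ERP sales module.
monthly_sales = [120, 135, 150, 160, 178, 190]

def forecast_next(values):
    """Forecast the next period with a least-squares linear trend."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept + slope * n  # prediction for period n (the next one)

print(round(forecast_next(monthly_sales), 1))
```

In practice the historical series would come from ERP transaction tables and the model would be far richer, but the principle, learning a pattern from past demand to predict future demand, is the same.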
Big data analytics in ERP applications can provide faster information delivery
and more reliable forecasting using tools such as Hadoop. Hadoop is a free distributed
processing framework and is widely regarded as one of the most effective tools for
analyzing big data. To conclude, the combination of big data and Enterprise Resource
Planning (ERP) is currently being adopted to get the most out of ERP data and make
business decisions better and more efficient.
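Hadoop itself is a Java framework, but its core MapReduce idea can be sketched in a few lines of plain Python; the order records here are made up for illustration.

```python
from collections import defaultdict

# Hypothetical (product, quantity) records, as might be split across cluster nodes.
records = [("widget", 3), ("gadget", 5), ("widget", 2), ("gadget", 1)]

# Map step: emit key/value pairs (here the records already are pairs).
mapped = [(product, qty) for product, qty in records]

# Shuffle step: group values by key.
grouped = defaultdict(list)
for key, value in mapped:
    grouped[key].append(value)

# Reduce step: aggregate each key's values.
totals = {key: sum(values) for key, values in grouped.items()}
print(totals)
```

Hadoop runs the map and reduce steps in parallel across many machines; this sequential sketch only shows the shape of the computation, not the distribution.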
Big Data Analytics Tools
Tableau Public
What is Tableau Public:
It is a simple and intuitive tool that offers intriguing insights through data
visualization. Tableau Public has a million-row limit. Because it is easy to use, it
fares better than most of the other players in the data analytics market. With
Tableau's visuals, you can investigate a hypothesis, explore the data, and
cross-check your insights.
Uses of Tableau Public:
• You can publish interactive data visualizations to the web at no cost.
• No programming skills are required.
Visualizations published to Tableau Public can be embedded into blogs and web
pages and shared through email or social media. The shared content can
be made available for download. This makes it one of the most effective big data
analytics tools.
Limitations of Tableau Public
• All data is public and offers little scope for restricted access
• Data size limitation
• Cannot be connected to R.
• The only ways to read data in are via OData sources, Excel or txt files
OpenRefine
What is OpenRefine:
Formerly called Google Refine, it is data cleaning software that helps
you clean up data for analysis. It operates on rows of data that have cells
under columns, much like relational database tables.
Uses of OpenRefine:
• Cleaning messy data
• Transformation of data
• Parsing data from websites
• Adding data to the dataset by fetching it from web services; for example,
OpenRefine could be used for geocoding addresses to geographic coordinates.
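OpenRefine itself is driven through its GUI and GREL expressions, but the kind of clean-up it performs can be sketched in plain Python; the messy names below are invented.

```python
# Hypothetical messy customer names, as exported from a spreadsheet.
raw = ["  Alice ", "alice", "BOB", "Bob ", "carol"]

def clean(values):
    """Trim whitespace, normalize case, and drop duplicates (order preserved)."""
    seen, result = set(), []
    for v in values:
        norm = v.strip().title()
        if norm not in seen:
            seen.add(norm)
            result.append(norm)
    return result

print(clean(raw))
```

OpenRefine adds much more on top of this (clustering of near-duplicates, undo history, faceted browsing), but trimming, case normalization and deduplication are the typical first steps.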
Limitations of OpenRefine:
• OpenRefine is unsuitable for large datasets.
• It does not work very well with big data.
KNIME
What is KNIME:
KNIME lets you manipulate, analyze, and model data through visual
programming. It is used to integrate various components for data mining and
machine learning.
Uses of KNIME:
• Instead of writing blocks of code, you drag and drop connection points
between activities.
• This data analysis tool supports multiple programming languages.
In fact, analysis tools like these can be extended to handle chemistry data, text
mining, Python and R.
Limitation of KNIME:
• Poor data visualization
RapidMiner
What is RapidMiner:
RapidMiner provides machine learning procedures and data mining, including data
visualization, processing, statistical modeling and predictive analytics. Written
in Java, RapidMiner is fast gaining acceptance as a big data analytics tool.
Uses of RapidMiner:
• It provides an integrated environment for business analytics, predictive analysis.
• Along with commercial and business applications, it is also used for application
development.
Limitations of RapidMiner:
• RapidMiner has size constraints with respect to the number of rows.
• RapidMiner requires more hardware resources than ODM and SAS.
Google Fusion Tables
What is Google Fusion Table:
When it comes to data tools, Google Fusion Tables is a cooler, larger version of
Google Spreadsheets: an incredible tool for data analysis, mapping, and large
dataset visualization. Google Fusion Tables also belongs on the business analytics
tools list as one of the most effective big data analytics tools.
Uses of Google Fusion Tables:
• Visualize bigger table data online.
• Filter and summarize across many thousands of rows.
• Combine tables with other data on the web.
• You can merge two or three tables to get a single visualization that includes both
sets of data.
• You can create a map in minutes.
Limitations of Google Fusion Tables:
• Only the first 100,000 rows of data in a table are included in
query results or mapped.
• The total size of the data sent in one API call cannot be more than 1 MB.
NodeXL
What is NodeXL:
It is visualization and analysis software for relationships and networks. NodeXL
provides exact calculations. It is free (in its non-professional edition), open-source
network analysis and visualization software, and one of the
most effective statistical tools for data analysis, with advanced
network metrics, access to social media network data importers, and
automation.
Uses of NodeXL:
This is one of the data analysis tools for Excel that helps in the
following areas:
• Data Import
• Graph Visualization
• Graph Analysis
• Data Representation
This software integrates into Microsoft Excel 2007, 2010, 2013, and 2016. It opens
as a workbook with a variety of worksheets containing the elements of a graph
structure, such as nodes and edges. The software can import various graph
formats, such as adjacency matrices, Pajek .net, UCINet .dl, GraphML, and edge lists.
Limitations of NodeXL:
• You must use multiple seeding terms for a particular problem.
• Data extractions run at slightly different times may give different results.
Wolfram Alpha
What is Wolfram Alpha:
It is a computational knowledge engine or answering engine founded by Stephen
Wolfram.
Uses of Wolfram Alpha:
• It serves as an add-on for Apple’s Siri.
• Provides detailed responses to technical searches and solves calculus problems.
• Helps business users with information charts and graphs, and helps in creating
topic overviews, commodity information, and high-level pricing history.
Limitations of Wolfram Alpha:
• Wolfram Alpha can only deal with publicly known numbers and facts, not with
viewpoints.
• It limits the computation time for each query.
Google Search Operators
What is Google Search Operators:
It is a powerful resource that helps you filter Google results instantly to get the
most relevant and useful information.
Uses of Google Search Operators:
• Faster filtering of Google search results
• Google’s powerful data analysis tool can help discover new information.
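A few commonly documented operators illustrate the idea (the queries shown are examples only):

```
site:example.com analytics      restrict results to one domain
filetype:pdf "big data"         return only PDF documents
"enterprise resource planning"  match the exact phrase
analytics -tableau              exclude a term from the results
```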
Solver
What is Excel Solver:
The Solver add-in is a Microsoft Office Excel add-in program, available
after you install Microsoft Excel or Office. It is a statistical and optimization tool in
Excel that allows you to set constraints. It is an advanced optimization tool that
helps in quick problem-solving.
Uses of Solver:
• The final values found by Solver are a solution for the interrelated decision
variables.
• It uses a variety of methods, from nonlinear optimization and statistics to
evolutionary and genetic algorithms, to find solutions.
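What Solver does can be illustrated with a tiny constrained-optimization sketch in plain Python (a brute-force grid search, not Solver's actual algorithms; the profit and labor figures are invented): maximize profit from two products subject to a labor-hour limit.

```python
# Hypothetical data: profit per unit and labor hours per unit for two products.
profit = {"A": 30, "B": 50}
hours = {"A": 1, "B": 2}
HOURS_AVAILABLE = 40

best_plan, best_profit = None, -1
# Brute-force search over integer production quantities (a stand-in for
# Solver's Simplex/GRG/evolutionary methods).
for qty_a in range(HOURS_AVAILABLE + 1):
    for qty_b in range(HOURS_AVAILABLE + 1):
        if qty_a * hours["A"] + qty_b * hours["B"] <= HOURS_AVAILABLE:
            p = qty_a * profit["A"] + qty_b * profit["B"]
            if p > best_profit:
                best_plan, best_profit = (qty_a, qty_b), p

print(best_plan, best_profit)
```

Solver solves the same kind of problem directly in the spreadsheet, with the objective cell, variable cells and constraint cells standing in for the Python variables above.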
Limitations of Solver:
• Poor scaling is one of the areas where Excel Solver falls short.
• Poor scaling can affect solution time and quality.
• It can also affect the intrinsic solvability of your model.
Dataiku DSS
What is Dataiku DSS:
This is a collaborative data science software platform that helps a team build,
prototype, explore, and deliver its own data products more efficiently.
Uses of Dataiku DSS:
Dataiku DSS provides an interactive visual interface in which users can point,
click, and build, or use languages like SQL.
Limitation of Dataiku DSS:
• Limited visualization capabilities
• UI hurdles: Reloading of code/datasets
• Inability to easily compile entire code into one document/notebook
• Still needs to be integrated with Spark
The Importance of Big Data Analytics
Cost reduction.
Big data technologies such as Hadoop and cloud-based analytics bring significant cost
advantages when it comes to storing large amounts of data, and they can also
identify more efficient ways of doing business.
Faster, better decision-making.
With the speed of Hadoop and in-memory analytics, combined with the ability to
analyze new sources of data, businesses can analyze information
immediately and make decisions based on what they have learned.
New products and services.
With the ability to gauge customer needs and satisfaction through analytics comes the
power to give customers what they want. Davenport points out that with big data
analytics, more companies are creating new products to meet customers’ needs.
4 Types of Big Data Analytics
Descriptive Analytics
This technique is the most time-intensive and often produces the least
value; however, it is useful for uncovering patterns within a
particular segment of customers. Descriptive analytics provide insight into what has
happened historically and give you trends to dig into in
more detail. Examples of descriptive analytics include summary statistics,
clustering and association rules used in market basket analysis. Key points:
• Backward looking
• Focused on descriptions and comparisons
• Pattern detection and descriptions
• MECE (mutually exclusive and collectively exhaustive) categorization
• Category development based on similarities and differences (segmentation)
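A minimal sketch of descriptive (summary) statistics in Python, using the standard library and an invented list of order values for one customer segment:

```python
import statistics

# Hypothetical order values for one customer segment.
orders = [25, 30, 30, 45, 60, 30, 40]

summary = {
    "count": len(orders),
    "mean": statistics.mean(orders),
    "median": statistics.median(orders),
    "mode": statistics.mode(orders),
    "stdev": round(statistics.stdev(orders), 2),
}
print(summary)
```

These backward-looking summaries describe what the segment did; they make no claim about why, or about what will happen next.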
Diagnostic Analytics
Data scientists turn to this technique when trying to determine why something
happened. It is useful when researching leading churn indicators and usage trends
among your most loyal customers. Examples of diagnostic analytics include churn
reason analysis and customer health score analysis. Key points:
• Backward looking
• Focused on causal relationships and sequences
• Relative ranking of dimensions/variables based on inferred explanatory power
• Target/dependent variable with independent variables/dimensions
• Includes both frequentist and Bayesian causal inferential analyses
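A minimal sketch of churn reason analysis: counting exit-survey responses (invented data) to rank likely causes.

```python
from collections import Counter

# Hypothetical exit-survey responses from churned customers.
churn_reasons = ["price", "support", "price", "features", "price", "support"]

# Rank reasons from most to least frequent.
ranking = Counter(churn_reasons).most_common()
print(ranking)
```

Real diagnostic work would go further, testing whether the top-ranked reason actually explains churn rather than merely co-occurring with it, but frequency ranking is a common first step.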
Predictive Analytics
The most commonly used technique, predictive analytics uses models to forecast
what might happen in specific scenarios. Examples of predictive analytics include
next-best-offer, churn risk and renewal risk analysis.
• Forward looking
• Focused on non-discrete predictions of future states, relationships, and patterns
• Description of prediction result set probability distributions and likelihoods
• Model application
• Non-discrete forecasting (forecasts communicated in probability distributions)
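A minimal sketch of a predictive score expressed as a probability: a hand-tuned logistic curve (the coefficients are invented for illustration, not learned from data) mapping days since last login to a churn probability.

```python
import math

def churn_probability(days_inactive, midpoint=30, steepness=0.15):
    """Logistic curve: probability rises as inactivity grows past the midpoint."""
    return 1 / (1 + math.exp(-steepness * (days_inactive - midpoint)))

for days in (5, 30, 90):
    print(days, round(churn_probability(days), 2))
```

In a real model the midpoint and steepness would be fitted to historical churn data; the point here is that the output is a probability, matching the "forecasts communicated in probability distributions" key point above.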
Prescriptive Analytics
The most valuable and most underused big data analytics technique, prescriptive
analytics gives you a laser-like focus to answer a specific question. It helps
determine the best solution among a range of choices, given the known parameters,
and suggests options for how to take advantage of a future opportunity or
mitigate a future risk. It can also illustrate the implications of each decision to
improve decision-making. Examples of prescriptive analytics for customer retention
include next-best-action and next-best-offer analysis.
• Forward looking
• Focused on optimal decisions for future situations
• Simple rules to complex models that are applied on an automated or programmatic
basis
• Discrete prediction of individual data set members based on similarities and
differences
differences
• Optimization and decision rules for future events
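A minimal sketch of a prescriptive "next best action" rule: pick the retention action with the highest expected value (the success rates, values and costs below are invented).

```python
# Hypothetical retention actions: probability of success, value if the
# customer is retained, and cost of taking the action.
actions = {
    "discount":   {"p_success": 0.40, "value": 500, "cost": 50},
    "call":       {"p_success": 0.25, "value": 500, "cost": 20},
    "do_nothing": {"p_success": 0.05, "value": 500, "cost": 0},
}

def expected_value(a):
    """Expected payoff of an action: chance of success times value, minus cost."""
    return a["p_success"] * a["value"] - a["cost"]

best_action = max(actions, key=lambda name: expected_value(actions[name]))
print(best_action)
```

This is the simplest form of the "optimization and decision rules" key point: a predictive estimate (the success probabilities) is turned into a concrete recommendation for what to do next.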