Data Scientist and Machine Learning

One of the most important technological developments today is taking place in data processing. Big Data, Data Mining, Machine Learning, Artificial Intelligence: these are the keywords that mark this path. Prepare to use them with the help of our courses!

695 000 Ft
(Gross price: 882 650 Ft)

Data Scientist Comprehensive Training

Code: DS100

This four-day workshop is created by our partner COMPENDIUM (PL). It covers data science and machine learning workflows at scale using Apache Spark 2 and other key components of the Hadoop ecosystem. The workshop emphasizes the use of data science and machine learning methods to address real-world business challenges.

The workshop is designed for data scientists who currently use Python or R to work with smaller datasets on a single machine and who need to scale up their analyses and machine learning models to large datasets on distributed clusters. Data engineers and developers with some knowledge of data science and machine learning may also find this workshop useful.

Using scenarios and datasets from a fictional technology company, students discover insights to support critical business decisions and develop data products to transform the business. The material is presented through a sequence of brief lectures, interactive demonstrations, extensive hands-on exercises, and discussions. The Apache Spark demonstrations and exercises are conducted in Python (with PySpark) and R (with sparklyr) using the Cloudera Data Science Workbench (CDSW) environment.

Participants gain practical skills and hands-on experience with data science tools including:
  • Spark, Spark SQL, and Spark MLlib
  • PySpark and sparklyr
  • Cloudera Data Science Workbench (CDSW)
  • Hue

The workshop includes brief lectures, interactive demonstrations, hands-on exercises, and discussions covering topics including:
  • Overview of data science and machine learning at scale
  • Overview of the Hadoop ecosystem
  • Working with HDFS data and Hive tables using Hue
  • Introduction to Cloudera Data Science Workbench
  • Overview of Apache Spark 2
  • Reading and writing data
  • Inspecting data quality
  • Cleansing and transforming data
  • Summarizing and grouping data
  • Combining, splitting, and reshaping data
  • Exploring data
  • Configuring, monitoring, and troubleshooting Spark applications
  • Overview of machine learning in Spark MLlib
  • Extracting, transforming, and selecting features
  • Building and evaluating regression models
  • Building and evaluating classification models
  • Building and evaluating clustering models
  • Cross-validating models and tuning hyperparameters
  • Building machine learning pipelines
  • Deploying machine learning models

Workshop participants should have a basic understanding of Python or R and some experience exploring and analyzing data and developing statistical or machine learning models. Knowledge of Hadoop or Spark is not required.

Duration: 4 days
Difficulty level:
  • Advanced
33 500 Ft
(Gross price: 42 545 Ft)

Big Data and Analytics for Business Users

Code: WA2186

Data is one of the most valuable assets that your organization possesses. Every day you are creating more data and potentially passing up opportunities to harvest that data and use it to accelerate the achievement of your organization’s strategic objectives. Big Data and Analytics represent an emerging trend around harvesting, analyzing, and capitalizing on the wealth of data that is within the grasp of your enterprise.

“For every 100 open Big Data jobs, there are only two qualified candidates” - fastcompany.com

This one-day primer introduces Cloud Computing, Big Data, and the emerging discipline of Data Analytics. Attention will be given to the three V’s of Big Data: Volume, Velocity, and Variety, as well as the fourth V of Value. You’ll learn about these critical elements and the powerful value proposition that these capabilities provide. What are the processes, tools, and personnel that will be needed in order to take advantage of this sea change in information management? This essential course will equip you to understand your customers better and to deliver more value today.

This product is delivered as a voucher. After ordering, the voucher(s) will be available in your Dashboard (myLeapest).

Content
The following contents are included with this product. For any questions about these contents, please contact the Seller of this product.
  • Big Data and Analytics for Business Users - eBook-English-(en-US)

Topics:
  • Cloud Computing Basics
  • Introduction to Big Data
  • Understanding Data Analytics
  • Understanding Predictive Analytics
  • Basics of Analytical Modeling
  • Unpacking the Value, Volume, Velocity, and Variety
  • Organizational Considerations
  • Recommended Next Steps

Audience: Managers, Analysts, Architects, and Team Leads

Prerequisites: None

Outline of Big Data and Analytics for Business Users Training

Chapter 1. Defining Big Data
  • In-Class Discussion
  • Gartner's Definition of Big Data
  • More Definitions of Big Data
  • Transforming Data into Business Information
  • Challenges Posed by Big Data
  • Processing Big Data
  • Apache Hadoop
  • The Cloud and Big Data
  • The CAP Theorem
  • Summary

Chapter 2. Hadoop Overview
  • The Client – Server Processing Pattern
  • Apache Hadoop
  • Apache Hadoop Logo
  • Typical Hadoop Applications
  • Hadoop Clusters
  • Hadoop Distributions
  • Hadoop's Main Components
  • HDFS
  • HDFS Blocks
  • YARN
  • Hadoop-based Systems for Data Analysis
  • MapReduce
  • Similarity with SQL Aggregation Operations
  • Distributed Computing Economics
  • Discussion: Divide and Conquer
  • Apache Pig
  • Pig Latin
  • Running Pig
  • Pig Latin Script Example
  • What is Hive?
  • Hive's Value Proposition
  • Who uses Hive?
  • What Hive Does Not Have
  • HiveQL
  • Working with Hive Tables
  • What is HBase?
  • HBase vs RDBMS
  • Interfacing with HBase
  • HBase Table Design Digest
  • A Cell's Value Versioning
  • Creating and Populating a Table in HBase Shell
  • Getting a Cell's Value
  • Counting Rows in an HBase Table
  • Summary

Chapter 3. Big Data Analytics in the Cloud
  • Data is King
  • Big Data Stores in the Cloud
  • Example: AWS Simple Storage Service (S3)
  • MapReduce (and Hadoop) in the Cloud
  • Information and Data Security
  • Data-at-rest Security Examples
  • Example of Object Encryption in S3
  • One S3 Use Case: Backup and Archiving
  • Data Analytics Services in the Cloud
  • Analytics Services with AWS
  • AWS EMR: Software Configuration Screen
  • AWS EMR: Hardware Configuration Screen
  • Big Data Analytics Solutions from Google Cloud
  • Google Data Processing and Analytics Pipelines
  • Google BigQuery
  • Machine Learning
  • Microsoft Azure ML Studio
  • Machine Learning Pipeline
  • Summary

Chapter 4. Making Big Data Small Techniques
  • What is Data Science?
  • Data Science, Machine Learning, AI?
  • Making Big Data Small
  • Descriptive Statistics
  • Correlation
  • Reducing the Number of Data Attributes
  • Lasso Regularization
  • Sampling Examples
  • Data Compression
  • Summary

Chapter 5. Introduction to Apache Spark
  • What is Apache Spark
  • Where to Get Spark?
  • The Spark Platform
  • Spark Logo
  • Common Spark Use Cases
  • Running Spark on a Cluster
  • The Driver Process
  • Spark Shell
  • Interfaces with Data Storage Systems
  • Limitations of Hadoop's MapReduce
  • Spark vs MapReduce
  • The Resilient Distributed Dataset (RDD)
  • Spark Streaming (Micro-batching)
  • Spark SQL
  • Example of Spark SQL
  • Spark Machine Learning Library
  • Example: Using Random Forests with Spark MLlib
  • The Output (the “Confusion” matrix)
  • Dumping the Trained Model
  • Clustering
  • Finding Centroids
  • Example: Using kMeans Module with Spark MLlib
  • Printing the Centroids
  • GraphX
  • Summary

Duration: 1 day
Difficulty level:
  • Beginner
19 500 Ft
(Gross price: 24 765 Ft)

Machine Learning with Apache Spark Foundation

Code: WA2610

To stay competitive, organizations have started adopting new approaches to data processing and analysis. For example, data scientists are turning to Apache Spark to process massive amounts of data using its distributed compute capability and its built-in machine learning library.

This intensive Apache Spark training course provides an overview of data science algorithms as well as the theoretical and technical aspects of using the Apache Spark platform for Machine Learning. The course is supplemented by a variety of hands-on labs that help attendees reinforce their theoretical knowledge of the learned material.

This product is delivered as a voucher. After ordering, the voucher(s) will be available in your Dashboard (myLeapest). You will also receive the required software setup for installation 48 hours from the time of purchase.

Product description
The following contents are included with this product. For any questions about these contents, please contact the Seller of this product.
  • Machine Learning with Apache Spark - Lecture Ebook - eBook-English-(en-US)
  • Machine Learning with Apache Spark - Lab Guide

Audience: Data Scientists, Business Analysts, Software Developers, IT Architects

Prerequisites: Participants should have a general knowledge of statistics and programming

Course Outline

Chapter 1. Machine Learning Algorithms
  • Supervised vs Unsupervised Machine Learning
  • Supervised Machine Learning Algorithms
  • Unsupervised Machine Learning Algorithms
  • Choose the Right Algorithm
  • Life-cycles of Machine Learning Development
  • Classifying with k-Nearest Neighbors (SL)
  • k-Nearest Neighbors Algorithm
  • The Error Rate
  • Decision Trees (SL)
  • Random Forests
  • Unsupervised Learning Type: Clustering
  • K-Means Clustering (UL)
  • K-Means Clustering in a Nutshell
  • Regression Analysis
  • Logistic Regression
  • Summary

Chapter 2. Introduction to Functional Programming
  • What is Functional Programming (FP)?
  • Terminology: Higher-Order Functions
  • Terminology: Lambda vs Closure
  • A Short List of Languages that Support FP
  • FP with Java
  • FP With JavaScript
  • Imperative Programming in JavaScript
  • The JavaScript map (FP) Example
  • The JavaScript reduce (FP) Example
  • Using reduce to Flatten an Array of Arrays (FP) Example
  • The JavaScript filter (FP) Example
  • Common High-Order Functions in Python
  • Common High-Order Functions in Scala
  • Elements of FP in R
  • Summary

Chapter 3. Introduction to Apache Spark
  • What is Apache Spark
  • A Short History of Spark
  • Where to Get Spark?
  • The Spark Platform
  • Spark Logo
  • Common Spark Use Cases
  • Languages Supported by Spark
  • Running Spark on a Cluster
  • The Driver Process
  • Spark Applications
  • Spark Shell
  • The spark-submit Tool
  • The spark-submit Tool Configuration
  • The Executor and Worker Processes
  • The Spark Application Architecture
  • Interfaces with Data Storage Systems
  • Limitations of Hadoop's MapReduce
  • Spark vs MapReduce
  • Spark as an Alternative to Apache Tez
  • The Resilient Distributed Dataset (RDD)
  • Spark Streaming (Micro-batching)
  • Spark SQL
  • Example of Spark SQL
  • Spark Machine Learning Library
  • GraphX
  • Spark vs R
  • Summary

Chapter 4. The Spark Shell
  • The Spark Shell UI
  • Spark Shell Options
  • Getting Help
  • The Spark Context (sc) and SQL Context (sqlContext)
  • The Shell Spark Context
  • Loading Files
  • Saving Files
  • Basic Spark ETL Operations
  • Summary

Chapter 5. Spark Machine Learning Library
  • What is MLlib?
  • Supported Languages
  • MLlib Packages
  • Dense and Sparse Vectors
  • Labeled Point
  • Python Example of Using the Labeled Point Class
  • LIBSVM format
  • An Example of a LIBSVM File
  • Loading LIBSVM Files
  • Local Matrices
  • Example of Creating Matrices in MLlib
  • Distributed Matrices
  • Example of Using a Distributed Matrix
  • Classification and Regression Algorithm
  • Clustering
  • Summary

Chapter 6. Text Mining
  • What is Text Mining?
  • The Common Text Mining Tasks
  • What is Natural Language Processing (NLP)?
  • Some of the NLP Use Cases
  • Machine Learning in Text Mining and NLP
  • Machine Learning in NLP
  • TF-IDF
  • The Feature Hashing Trick
  • Stemming
  • Example of Stemming
  • Stop Words
  • Popular Text Mining and NLP Libraries and Packages
  • Summary

Lab Exercises
  • Lab 1. Learning the Lab Environment
  • Lab 2. The Spark Shell
  • Lab 3. Using Random Forests for Classification with Spark MLlib
  • Lab 4. Using k-means Algorithm from MLlib
  • Lab 5. Text Classification with Spark ML Pipeline

Target Audience
Data Scientists, Business Analysts, Software Developers, IT Architects

Course Agenda
  • Applied Data Science and Business Analytics
  • Machine Learning Algorithms, Techniques and Common Analytical Methods
  • Apache Spark Introduction
  • Spark’s MLlib Machine Learning Library

This Apache Spark training course has 3 hands-on labs that are outlined at the bottom of this page. The labs cover the spark-submit tool as well as the Apache Spark shell. The labs allow you to practice the following skills:

Lab 1 - Using the spark-submit Tool
Spark offers developers two ways of running their applications:
  • Using the spark-submit tool
  • Using Spark Shell
In this lab, we will review what is involved in using the spark-submit tool.

Lab 2 - The Apache Spark Shell
The interactive development environment in Spark is provided by the Spark Shell (also known as a REPL: Read/Eval/Print Loop tool), which is available for Scala and Python developers (Java is not yet supported). The lab instructions below apply to the Scala version of the Spark Shell.

Lab 3 - Using Random Forests for Classification with Spark MLlib
In this lab, we will learn how to use the Random Forests implementation from Spark's Machine Learning library, MLlib, to perform object classification. The Random Forests algorithm is regarded as one of the most successful supervised learning algorithms and can be used for both classification and regression. In our work we will use the Python version of the library, which provides an API similar to those implemented in Scala and Java. We will also use the spark-submit Spark tool to submit the application from the command line rather than typing commands into the Spark Shell.

Duration: 1 day
Difficulty level:
  • Beginner
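
The WA2610 course outline lists "Common High-Order Functions in Python" and a reduce-based flattening of an array of arrays (which the outline shows for JavaScript). As a rough illustration of what those topics refer to, here is a minimal Python sketch using only the standard library. The sample data and variable names are hypothetical and are not taken from the course material.

```python
from functools import reduce

# Hypothetical sample data: a list of lists, standing in for
# values spread across several partitions.
partitions = [[3, 8, 1], [5, 2], [9]]

# reduce: flatten the list of lists by concatenating each part
# onto an accumulator that starts as an empty list.
flat = reduce(lambda acc, part: acc + part, partitions, [])

# map: apply a function to every element.
squared = list(map(lambda x: x * x, flat))

# filter: keep only the elements that satisfy a predicate.
large = list(filter(lambda x: x > 4, flat))

# reduce again: fold all values into a single total.
total = reduce(lambda a, b: a + b, flat, 0)

print(flat, squared, large, total)
```

The same map, filter, and reduce vocabulary appears in Spark's RDD API, which is presumably why the course covers functional programming before the Spark chapters.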
