-
CARDIF Nanterre
- Freelancer
2014 - 2015
Development, Analysis, Administration, Architecture:
Innovating and assisting the data scientists in developing their business
use cases with Machine Learning, using the benefits provided by
Big Data technologies such as Hadoop with Pig and Hive, and Spark
with DataFrames and its Machine Learning algorithms (e.g. use cases:
anti-fraud detection, churn, appetence, text analysis for automatic
classification, etc.)
Recommendation system:
- Implementation of a Proof Of Concept consisting of a real-time
recommendation system to recommend insurance products on new
clients' sales receipts when they go through the tills of hypermarkets
- Spark Machine Learning Pipeline for learning in batch mode:
Supervised learning with the Alternating Least Squares of Spark-ML
(matrix factorization algorithm, using the vector of users and the rank
parameter for dimensionality reduction). Unsupervised learning with the
K-Means algorithm of Spark-ML in order to find clusters of similarities in
terms of purchase behavior
- Hadoop HDFS for storing logs coming from the tills, for later
processing by feature engineering and learning algorithms. Hadoop
YARN for Spark and Kafka clusters (for processing and computing in
batch mode and in real-time mode)
- Real time with Kafka and Spark Streaming for predictions
(evaluations of sales receipts coming from the tills of supermarkets)
- Spark Python programming. Developing a Python-Scala bridge in
order to improve the performance of UDFs (User Defined Functions).
Linux shell programming for packaging and deployment in production
environment. Jupyter Notebook and Eclipse-PyDev IDE used as
development environments
Big Data: Machine Learning
Spark algos: Regression & Classification, Clustering and Recommendation
Using Spark ML with both DataFrames and RDDs
Scikit-Learn & Pandas
Turi (GraphLab Create)
Development (for Big Data and Back Office)
Languages: Java EE, Python
IDEs: Eclipse, IntelliJ, Jupyter Notebook, Spark Notebook, Apache Zeppelin
Linux Command Line and Shell
Architecture (for Big Data and Standard Systems)
Hadoop and its whole Ecosystem
Cloudera
Java Application Servers & RDBMS
Applicative & Physical architectures
Languages
French * * * * *
English * * * *
Interests
Others
Sports: Nature & Surfing (7"6)
Cooking: Healthy food (spicy)
Books: Science, Philosophy, Nature medicine
Movies: Fantasy, Comedy, Cartoons
PageRank:
- Understanding of the PageRank algorithm and developing it with
Spark RDDs in order to make a pedagogical presentation of Graph
Processing for the data scientists and to convince them to use Spark
GraphX coupled with a graph database like Neo4J
- Implementation of a Proof Of Concept based on Spark RDDs (to
present the PageRank algorithm) and Spark GraphX (for PageRank,
Connected Components, and Triangle Counting)
- Technos: Spark / Hadoop YARN, Hadoop HDFS, Python, Scala, Jupyter Notebook
POC with Spark DataFrames (SQL) and Spark Machine Learning:
- Implementation of Proofs Of Concept with Spark DataFrames and
SQL, and Spark Machine Learning (by using objects like Transformer
and Estimator for Pipelines, Evaluator, CrossValidator)
- Developments using ML algorithms like Linear and Logistic
Regression, Random Forest, Neural Networks and ALS for
recommendations
- Presentation of this work to the data scientists in order to explain
to them how, from their Jupyter environment, they can develop both with
Python Scikit-Learn and Pandas and with Spark ML and DataFrames
- Technos: Spark ML, Spark DF and SQL, Jupyter Notebook, Python,
Pandas
Bench project - Pig Hive vs Spark:
- Implementation of a benchmark in order to compare performance
between Pig/Hive and Spark SQL on a five-node Hadoop V2.6
cluster, based on a Left Outer Join query with tables containing retail
data
- Technos: Linux shell programming for packaging and deployment of
the project on different Hadoop-Spark clusters, Spark DataFrames,
Python
Development of Csv2Hive:
- Development of an injector named Csv2Hive available on GitHub
https://github.com/enahwe/Csv2Hive
- This tool dynamically infers the schema from big CSV files containing
lots of columns; it enables quick automatic injection of external
data to feed the Hive metastore and Hadoop HDFS
- Technos: 95% Linux shell scripting, 5% Python
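To illustrate the kind of schema inference described above, here is a minimal Python sketch. It is not the Csv2Hive code (which is mostly shell scripting). The helper names `infer_hive_type` and `infer_create_table`, the sampling size and the three-type mapping are assumptions made for the example.

```python
import csv
import io


def infer_hive_type(values):
    """Guess a Hive column type from sample string values (hypothetical helper)."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "STRING"
    # Try the narrowest type first, then fall back to wider ones.
    for hive_type, caster in (("BIGINT", int), ("DOUBLE", float)):
        try:
            for v in non_empty:
                caster(v)
            return hive_type
        except ValueError:
            continue
    return "STRING"


def infer_create_table(csv_text, table_name, delimiter=",", sample_rows=1000):
    """Build a CREATE TABLE statement from the header and a sample of rows."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    header = next(reader)
    columns = [[] for _ in header]
    for i, row in enumerate(reader):
        if i >= sample_rows:
            break
        for col, value in zip(columns, row):
            col.append(value.strip())
    fields = ", ".join(
        "`{}` {}".format(name.strip(), infer_hive_type(col))
        for name, col in zip(header, columns)
    )
    return (
        "CREATE TABLE {} ({}) "
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY '{}'"
    ).format(table_name, fields, delimiter)


sample = "id,price,label\n1,9.5,a\n2,3,b\n"
ddl = infer_create_table(sample, "retail_sample")
print(ddl)
```

Sampling only the first rows keeps the inference cheap on big files, at the cost of possibly mis-typing a column whose later rows differ.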
Miscellaneous tasks:
- Administration of a 4-node Cloudera cluster (Hadoop 2.3.0-cdh5.0.2),
used mainly for Pig and Hive
- Installation and administration of a 5-node Cloudera cluster (Hadoop
2.6.0-cdh5.4.4). Sizing for each node: 4 cores 2.6 GHz, 96 GB memory,
1 TB disk
- Installation of Spark on YARN with Anaconda on each Hadoop node
- Configuration of Jupyter Notebook for Spark on YARN, to allow
data scientists to discover Spark in the Hadoop cluster
- In charge of feeding business data into the Hadoop data lake
(hence Csv2Hive)
- Developed MapReduce jobs in Java (e.g. inverted index)
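As an illustration of the inverted index mentioned above, here is a minimal sketch. The original jobs were written in Java on Hadoop MapReduce; this example only mimics the map, shuffle and reduce phases in plain Python. The function names and the sample documents are assumptions made for the example.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Emit (word, doc_id) pairs, as a mapper would."""
    for word in text.lower().split():
        yield word, doc_id


def reduce_phase(word, doc_ids):
    """Collapse all doc ids seen for one word into a sorted, de-duplicated list."""
    return word, sorted(set(doc_ids))


def inverted_index(documents):
    """Run map, shuffle (group by key) and reduce over a dict of doc_id -> text."""
    grouped = defaultdict(list)
    for doc_id, text in documents.items():
        for word, emitted_id in map_phase(doc_id, text):
            grouped[word].append(emitted_id)
    return dict(reduce_phase(word, ids) for word, ids in grouped.items())


index = inverted_index({"d1": "spark on yarn", "d2": "hive on yarn"})
print(index["yarn"])
```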
Machine-Learning Challenge (Retail domain):
- Multi-categorization for the CDiscount company (challenge on
https://www.datascience.net/fr/challenge/20/details). Developed a
program with more than 500 Multinomial Naive Bayes models,
stemming, stratified sampling and mutual information
Consultant at BULL - Missions for CDISCOUNT
CDISCOUNT
-
Bordeaux (France)
- Consultant
2014 - 2014
Big Data implementations:
- Proofs Of Concept based on the Hadoop Cloudera distribution, with
recommenders based on Naïve Bayes (Python, Apache Mahout and
Java)
- Proofs Of Concept based on Zookeeper, Kafka, Storm and
Cassandra
- Survey to combine M2M services into a Big Data infrastructure
- Machine Learning: Demos around Neural Networks for Classification
(supervised learning)
Consultant at BULL - Mission for CETE
CETE
-
Bordeaux (France)
- Consultant
2014 - 2014
Traffic jam web application:
- Architecture surveys around a web application implemented in Java
and JavaScript, dedicated to producing real-time content about
traffic jams, car crashes, roadworks, etc.
Consultant at BULL - Mission for VOYAGES-SNCF
VOYAGES-SNCF
-
Nantes-Lille
- Consultant
2013 - 2013
Yield Management in the domain of train transportation:
- Audits for designing n-tier architectures of existing applications
dedicated to the Yield Management business domain (optimizing the
prices for train tickets)
- List of technologies involved: AngularJS, jQuery with jqPlot for the
GUIs, web services with JBoss AS server configured in High
Availability, Drools rules engine (BRMS) to automatically compute the
recurrent and simple business rules
Consultant at BULL - Architecture around Banking Platform
BULL
-
Bordeaux (France)
- Consultant
2013 - 2013
Investigations around a supervision platform for banking terminals:
- Audits around new scenarios and new technologies in order to evolve
the existing platform and to boost its performance and scalability
Consultant at BULL - Mission for POLE-EMPLOI
POLE-EMPLOI
-
Bordeaux (France)
- Consultant
2010 - 2013
Support and compliance of operational processes and action plans:
- Compliance of the project plans and operational processes to
successfully deliver the statistical business applications on time
- List of technologies: Java-EE, customer's frameworks, ClearCase,
Maven, SonarQube, WebLogic, Oracle, SAS, Unix
Design and development of an SSO launcher for SAS Enterprise Guide
V4.3:
- Launcher deployed in all the agencies, providing automatic
authentication for end-users such as statisticians
Design and development (in Java) of a reliable integration chain for
SAS components running in Cobol and Unix:
- Solution similar to a continuous integration system, automatically
preparing each SAS component for a specific target environment
such as production
Consultant at BULL - Pre-sale for CITY HALL OF BRUSSELS
-
Bordeaux (France)
- Consultant
2010 - 2010
Quotation for a call for tender, successfully won, consisting of the
creation of a Java EE web application for managing European
historical heritage (method used: Use Case Points)
Consultant at BULL - Mission for POLE-EMPLOI
POLE-EMPLOI
-
Bordeaux (France)
- Consultant
2010 - 2010
Audits around the Business and Project owners:
- Definition of a specification to start, as soon as possible, a new
Java EE web application in order to facilitate job searches
Consultant at BULL - Mission for CETE
CETE
-
Bordeaux (France)
- Consultant
2009 - 2009
Designing new architectures:
- Specifications around Java ESB technologies in order to transfer the
business data in a secure and reliable way (solution based on
Java OSGi and Apache Camel)
Consultant at BULL - Mission for MAAF
MAAF
-
Niort
- Consultant
2007 - 2008
Designing architectures:
- Opportunity surveys, audits, feasibility surveys, costings
Consultant at BULL - R&D European Project Manager
ITEA2
-
Bordeaux (France)
- Consultant
2006 - 2007
French lead and coordinator of a European R&D project called
Usenet (ITEA2 consortium), in order to create a new European
standard in the M2M (Machine to Machine) domain
Consultant at BULL - ETL Projects & Pre-sale
BULL
-
Bordeaux (France)
- Architecture and Development
2006 - 2006
COMPLETEL-BOUYGUES:
- Design and development of a Java ETL application based on Talend,
in order to automatically transfer, each month, the invoices that come
directly from the clients
-
Bordeaux (France)
- Consultant
2006 - 2006
Architecture and Development around ETL (Talend) for
CNAMTS:
- Design and development of a Java ETL application based on Talend,
in order to import-export the data from the fleet management sources,
to SIEBEL and GLPI
-
Bordeaux (France)
- Consultant
2006 - 2006
Costings and technical surveys around solutions using GPS
localization in the domain of Fleet Management
- Investigations to make recommendations in order to evolve
client-server architectures to N-tier JEE architectures
- Investigations around access control systems for time
management
Java-EE Expertise - Development and Architecture
LECTRA
-
Bordeaux (France)
- Java-EE Expertise - Development and Architecture
2002 - 2006
Technical support to help the development teams
Development around a Java ETL (based on Oracle Sunopsis) for
real-time synchronization and bidirectional communication between
heterogeneous databases
Specifications and developments in Java around a web PDM
application (Product Data Management). List of technologies: Rational
Rapid Developer, WebSphere, Tomcat, Oracle RDBMS, tests with IBM
Workload Simulator
Development to automatically install any Oracle 10g Database
in "silent" mode. List of technologies: InstallShield, Java, Ant
Many developments based on Java EJBs, JBoss, WebLogic and
WebSphere servers
Java Expertise - Development
BULL
-
Bordeaux (France)
- MQSeries Broker
1997 - 2002
Back-office Java developments for call centers
Development in Java of a CTI server (Computer Telephony Integration)
for Alcatel-Lucent PABXs, on top of Genesys middleware, providing
CTI integration with high availability for up to hundreds of connected
operators using a CRM application (Customer Relationship
Management)
Main technologies: Java, Siebel CRM, CTI Genesys, Oracle RDBMS,
MySQL RDBMS, SQL-Server RDBMS, MQSeries Broker, SWIFT,
LDAP Directory
Main customers: CNAMTS, MGEN GROUP CIC, GROUPAMA,
URSSAF, CNCA
Electronic Expertises
MATTHEWS SWEDOT, GAIA, IBM, SOULE
-
Bordeaux (France)
- Electronic Expertises
1990 - 1996
01/1995-08/1996 - Electronic technician at MATTHEWS SWEDOT:
- Maintenance, after-sales and training in the domain of industrial printer systems
06/1994-08/1994 - Electronic Engineer (trainee position) at GAIA:
- Design and creation of miniaturized power supplies based on high frequencies
01/1993-08/1993 - Electronic technician at MATTHEWS SWEDOT:
- Maintenance and after-sales in the domain of industrial printer systems
1992 - Mainframe systems tester at IBM:
- System test technician for IBM 3090 and ES9000 mainframes
1990-1991 - Electronic technician at SOULE:
- Design, implementation and testing of lightning detection systems (customer: Electricite De France), based on Motorola microcontrollers