Greg Makowski

Los Altos, California, United States
7K followers 500+ connections

About

Develop data science enterprise solutions with a motivation to deeply understand the…

Experience & Education

  • Pivot-XY

Volunteer Experience

  • Vice Chair, Chair, Chair of Data Science SIG

    San Francisco Bay Area ACM

    - Present 16 years

    Education

    We are a local chapter of the Association for Computing Machinery, www.ACM.org. We have been running continuously since 1957 - older than many meetup groups. We host about 25 events per year. Each month, we have two evening talks: General Computing and Data Science SIG. Twice a year, we hold Professional Development Seminars (PDS) on a Saturday. We also support local science fairs.

    * Organized Data Mining / Data Science Camp (unconference) annually since 2009
    http://www.sfbayacm.org/data-science-camp-2017/ (most recent)
    * Organized the first big data Kaggle competition, with 2 years of Best Buy data https://www.kaggle.com/c/acm-sf-chapter-hackathon-big
    https://www.kaggle.com/c/acm-sf-chapter-hackathon-small
    * Organized a number of day-long Professional Development Seminars (PDS): Cloud, R, Python, ...
    * Organized the Toyota Car Hackathon, 2013 http://events.sfbayacm.org/event/toyota-itc-acm-quantified-car-hackathon/

    www.SFbayACM.org
    http://www.meetup.com/SF-Bay-ACM/
    https://www.youtube.com/user/sfbayacm (125+ past talks on our channel)

  • Fundraiser & marathon runner

    The Leukemia & Lymphoma Society

    - 1 year

    Health

    I ran one marathon each year, organizing fundraiser events and raising $8,000 in total.

Patents

  • Event Lift Forecasting - automated forecasting for retail promotion events

    Filed US don't remember

    http://www.sfbayacm.org/event/dmsig-embedded-automatic-model-training-and-forecasting-enterprise-software-application-mar-11
    (Slides and presentation event)

    http://fora.tv/2009/03/11/Greg_Makowski_Event_Lift_Forecasting
    (Video of presentation)

Projects

  • Introduction to Data Mining Algorithms and their Evolution toward IoT

    Invited speaker for the Silicon Valley Business of Engineering Social Meetup. This was outside of work.

  • Global Big Data Conference 6/9/2017: "Predictive Model and Record Description Using Segmented Sensitivity Analysis"

    Describing a model, both overall (which variables matter most to the forecast) and per record (the reasons behind an individual forecast), can be a competitive advantage in a business or consulting situation. Design objectives for this model description include: 1) describe the model in terms of variables understandable to the target audience, 2) stay independent of the predictive algorithm, 3) support a single model or an ensemble of models, 4) pick up non-linearities in variables, and 5) pick up interaction effects between variables. The Segmented Sensitivity Analysis (SSA) method has been used by the author for 25 years in a variety of data mining projects. As with other descriptive techniques, there is a bit of art in developing a solution for a specific use case. A number of SSA variations and use cases will be discussed to illustrate how the system can be adapted. The SSA model description can also help during the model building process, and in detecting and describing model drift over time, as the scoring data slowly drifts away from the training data.

    R code to share is under development.
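
As a rough illustration of the sensitivity-analysis idea described above, the sketch below perturbs one input at a time and records how much a black-box model's prediction moves, both overall and within segments of the data. This is a generic one-at-a-time sensitivity analysis, not the author's SSA implementation. The `toy_model` function, the feature names, and the segmenting rule are all hypothetical, and Python is used here purely for illustration.

```python
from statistics import mean

def toy_model(row):
    # Hypothetical black-box model: non-linear in "price",
    # with an interaction between "price" and "promo".
    return 100.0 - 2.0 * row["price"] ** 2 + 5.0 * row["promo"] * row["price"]

def sensitivity(model, rows, feature, delta):
    """Mean absolute change in prediction when `feature` is shifted by +delta."""
    changes = []
    for row in rows:
        bumped = dict(row)
        bumped[feature] = row[feature] + delta
        changes.append(abs(model(bumped) - model(row)))
    return mean(changes)

def segmented_sensitivity(model, rows, feature, delta, segment_of):
    """Sensitivity computed separately per segment, to expose non-linear effects."""
    segments = {}
    for row in rows:
        segments.setdefault(segment_of(row), []).append(row)
    return {name: sensitivity(model, seg_rows, feature, delta)
            for name, seg_rows in segments.items()}

# Hypothetical records to score.
rows = [{"price": p, "promo": promo}
        for p in (1.0, 2.0, 3.0, 4.0) for promo in (0, 1)]

overall = {f: sensitivity(toy_model, rows, f, 0.1) for f in ("price", "promo")}
by_price_band = segmented_sensitivity(
    toy_model, rows, "price", 0.1,
    segment_of=lambda r: "low_price" if r["price"] < 2.5 else "high_price",
)
print(overall)
print(by_price_band)
```

Because the function only calls `model(row)`, the same code works for any algorithm or ensemble. The per-segment numbers differ when the model responds non-linearly to a variable, which a single overall number would hide.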

  • Invited speaker to Beijing at Global Mobile Internet Conference (GMIC): Tutorial #6, Predictive Data Science in R

    I was an invited speaker, with travel expenses paid to Beijing.

    I gave a 4-hour tutorial, which was later expanded to an 8-hour course. For details, see the related project titled "Develop and teach 8 hour course 'Predictive Data Science in R'".

  • Meetup: Production Model Lifecycle Management

    Data Scientists are quite motivated to develop accurate predictive models and do testing for generalization. Providing a decent description can be an afterthought. Talking to a number of modelers, I have heard them state "if I have to describe my model, then I will just skip certain algorithms that are hard to describe." The thesis of the presentation is that you can maximize all 3 objectives: 1) accuracy, 2) generalization, and 3) understandability.

    As a best practice, focus first on a metric that combines accuracy and generalization, track the design of experiments over model parameters in a model notebook, and report an estimate of the business value of each model.

    To describe the model globally, sensitivity analysis or LIME (Local Interpretable Model-agnostic Explanations) can be used.

    The model lifecycle does not just end with putting the model into production. You can use metrics to track the gradual data drift over time, as the model is repeatedly deployed. This tracking leverages the detailed model description analysis.
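
A minimal sketch of one way to track data drift follows. The talk description does not say which metrics it uses; the Population Stability Index (PSI) shown here is a commonly used choice swapped in for illustration, and the score lists and bin edges are made up.

```python
import math

def psi(expected, actual, edges):
    """Population Stability Index between two samples over shared bin edges."""
    def bin_shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            idx = sum(1 for e in edges if v >= e)  # index of the bin v falls into
            counts[idx] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e_shares, a_shares = bin_shares(expected), bin_shares(actual)
    return sum((a - e) * math.log(a / e) for e, a in zip(e_shares, a_shares))

# Hypothetical model scores at training time and at two later scoring runs.
training_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
same_scores = list(training_scores)
drifted_scores = [0.5, 0.6, 0.6, 0.7, 0.7, 0.8, 0.8, 0.9]
edges = [0.25, 0.5, 0.75]

print(psi(training_scores, same_scores, edges))
print(psi(training_scores, drifted_scores, edges))
```

An unchanged distribution gives a PSI of zero, and the value grows as the scoring data moves away from the training data. The same function could be applied per input variable, not only to the model score.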

  • Deep Learning Meetup: Using Deep Learning to do Real-Time Scoring in Practical Applications

    The talk will cover a brief review of neural network basics and the following types of neural network deep learning:
    * autocorrelational - unsupervised learning for extracting features. Describe how additional layers build complexity in the feature extraction.
    * convolutional - how to detect shift-invariant patterns in various data sources. Horizontal shift-invariant detection applies to signals such as speech or IoT data. Horizontal and vertical shift invariance applies to images or videos, for example for faces or self-driving cars
    * discuss details of applying deep net systems for continuous or real time scoring
    * reinforcement learning or Q Learning - such as learning how to play Atari video games
    * continuous space word models - such as word2vec, skipgram training, NLP understanding and translation
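
To make the shift-invariance point in the convolutional item concrete, here is a small sketch in plain Python. It slides one filter along a 1-D signal and takes the maximum response (global max pooling). The filter weights and signals are hypothetical; a real network would learn the filter from data.

```python
def correlate1d(signal, kernel):
    """Valid-mode sliding dot product (what deep learning libraries call convolution)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def detect(signal, kernel):
    """Global max pooling: strongest filter response and the position where it occurs."""
    feature_map = correlate1d(signal, kernel)
    best = max(feature_map)
    return best, feature_map.index(best)

kernel = [1, 2, 1]                       # hypothetical filter tuned to a small "bump"
early = [0, 1, 2, 1, 0, 0, 0, 0, 0, 0]   # bump near the start of the signal
late = [0, 0, 0, 0, 0, 0, 1, 2, 1, 0]    # same bump, shifted right by 5 samples

print(detect(early, kernel))
print(detect(late, kernel))
```

The peak response is the same for both signals and only the reported position changes, which is why the same filter can detect a pattern wherever it occurs along the time axis.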

  • Conf: Case Studies Deploying Cluster Analysis

    Three case studies are discussed that include cluster analysis as a component.
    1) Customer description for a credit card attrition model, to describe how to talk to customers.
    2) Hotel price optimization. Use clusters to find subsets of similar behavior, and optimize prices within each cluster. Use a neural net as the objective function.
    3) Retail supply chain: planning replenishment with 52-week demand curves, using thousands of seasonal "profiles" or clusters.
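
The third case study can be sketched as grouping items by the shape of their demand curve. The talk description does not name a clustering algorithm, so plain k-means is an assumption here, and the toy curves use 4 periods instead of 52 weeks to keep the example short. All item names and numbers are hypothetical.

```python
def normalize(curve):
    """Scale a demand curve to sum to 1 so clustering compares shape, not volume."""
    total = sum(curve)
    return [v / total for v in curve]

def dist2(a, b):
    """Squared Euclidean distance between two curves."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, centroids, iterations=10):
    """Plain Lloyd's algorithm with caller-supplied starting centroids."""
    for _ in range(iterations):
        # Assign every curve to its nearest centroid.
        labels = [min(range(len(centroids)), key=lambda c: dist2(p, centroids[c]))
                  for p in points]
        # Move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster goes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical unit sales per period for four items.
demand = {
    "sunscreen": [5, 10, 40, 45],
    "swimwear": [10, 20, 80, 90],
    "snow_boots": [50, 30, 5, 15],
    "heaters": [100, 60, 10, 30],
}
items = list(demand)
profiles = [normalize(demand[item]) for item in items]
labels, centroids = kmeans(profiles, centroids=[profiles[0][:], profiles[2][:]])
print(dict(zip(items, labels)))
```

Each resulting centroid plays the role of a seasonal "profile": an item's forecast volume can be spread across the year according to the profile of the cluster it belongs to.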

  • Conference: Heuristic Design of Experiments with Meta-Gradient Search of Model Training Parameters

    Once you have started learning about predictive algorithms, and the basic knowledge discovery in databases process, what is the next level of detail to learn for a consulting project?

    * Give examples of the many model training parameters
    * Track results in a "model notebook"
    * Use a model metric that combines both accuracy and generalization to rank models
    * How to strategically search over the model training parameters - use a gradient descent approach
    * One way to describe an arbitrarily complex predictive system is by using sensitivity analysis
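
The bullets above can be sketched as a small search loop. This is only one possible reading: a greedy coordinate search stands in for the gradient-style search the talk mentions, the combined metric is an assumed formula, and `evaluate` is a hypothetical stand-in for actually training and testing a model.

```python
def evaluate(params):
    """Hypothetical stand-in for training a model; returns (train_acc, test_acc)."""
    depth, rate = params["depth"], params["rate"]
    train_acc = min(0.99, 0.70 + 0.04 * depth)  # deeper models fit training data better
    test_acc = 0.90 - 0.01 * (depth - 5) ** 2 - 2.0 * (rate - 0.1) ** 2
    return train_acc, test_acc

def score(train_acc, test_acc, penalty=0.5):
    """Assumed combined metric: test accuracy minus a penalty on the train/test gap."""
    return test_acc - penalty * abs(train_acc - test_acc)

def search(params, steps, rounds=20):
    notebook = []  # the "model notebook": one entry per experiment

    def run(p):
        train_acc, test_acc = evaluate(p)
        entry = {**p, "train": train_acc, "test": test_acc,
                 "score": score(train_acc, test_acc)}
        notebook.append(entry)
        return entry["score"]

    best = run(params)
    for _ in range(rounds):
        improved = False
        for name, step in steps.items():
            # Probe one step up and one step down in this parameter.
            for direction in (+1, -1):
                trial = {**params, name: params[name] + direction * step}
                trial_score = run(trial)
                if trial_score > best:
                    params, best, improved = trial, trial_score, True
                    break
        if not improved:
            break  # no neighbouring setting is better: stop
    return params, best, notebook

best_params, best_score, notebook = search(
    {"depth": 2, "rate": 0.3}, {"depth": 1, "rate": 0.05})
print(best_params, round(best_score, 4), len(notebook))
```

Every experiment, including the ones that did not improve the score, is kept in `notebook`, so the search path can be reviewed or resumed later.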

  • On conference program committee. Review and accept papers for industry track of ACM/IEEE conference "Data Science and Advanced Analytics", Oct 19-21 in Tokyo

    International Conference on Data Science and Advanced Analytics (DSAA) started in 2014 aiming to be a flagship in the data science and analytics field. It provides a premier forum that brings together researchers, industry practitioners, as well as potential users of data science and big data analytics. It covers all data science and analytics related areas, including statistical, probabilistic and mathematical methods, machine learning, data and business analytics, data mining and knowledge discovery, infrastructure, storage, retrieval and search, privacy and security, and relevant applications, practices, tools and evaluation. DSAA’2014 was not a fully IEEE supported conference, but was technically co-sponsored by IEEE Computational Intelligence Society (CIS) and ACM through SIGKDD. DSAA became a fully IEEE CIS supported conference from the second edition. The second IEEE DSAA’2015 was held in Paris in 2015 which was also very successful. The third IEEE DSAA’2016 is planned in Montreal. They continue to be technically sponsored by ACM.

    IEEE DSAA’2017 will consist of two main Tracks: Research and Application. The Research Track is aimed at collecting contributions related to theoretical foundations of Data Science and Data Analytics. The Application Track is aimed at collecting contributions related to applications of Data Science and Data Analytics in real life scenarios. DSAA’2017 therefore solicits both theoretical and practical works on data science and advanced analytics.

  • Develop and teach 8 hour course "Predictive Data Science in R"

    Go through a sprint of a predictive data mining project, introducing R as we go. Review the training process for regression, backpropagation neural nets, decision trees and XGBoost. Introduce R data.tables and the caret interface to 233 predictive algorithms. Focus on strategies to structure a successful project design and data pull. Review a variety of preprocessing and knowledge representation. Provide questions you can take away and apply to the design of your future projects, to describe models to clients (sensitivity analysis code included) and to manage models over their natural lifecycle. Introduce R + Spark integrations, and show an example R Shiny web GUI.

  • Automatic Model Building (had a patent application)

    How can the process of Knowledge Discovery in Databases be automated, competitive and reliable? One approach is to focus on a narrow vertical market application, with known data sources and data feeds. Then you can automate the Exploratory Data Analysis (EDA) and Preprocessing phases. But how do you automate the selection of training data? Can the enterprise application be installed and configured at a variety of clients without a Senior Knowledge Discovery Engineer? How can you minimize "worst case" results of such a system when used by a business user going through their normal business role? How can you deeply investigate and model "business values" (i.e. things that can get an end user promoted or fired) into the core of the data mining algorithms?

    This talk will answer these questions and more. The patent-pending application, ELF, is an enterprise application in the retail supply chain vertical market. Before the development of this system, one enterprise application was used to lay out a weekly newspaper flier three weeks before the sales event, which in turn fed data into a replenishment application. The replenishment application kept products on the store shelves, with a minimal amount of overstock and understock. The pain point was that the retail buyer would have to manually estimate the sales lift, or the multiplier increase in sales, for every item for every store. While human expertise can be great, it isn't as scalable when applied to a sales event with 1,000 - 4,000 items on sale in 6,000 stores. ELF (Event Lift Forecasting) would import data from a planned event and automatically analyze and forecast the lift for each store-item combination. Data elements used included pricing, placement in the flier, store geography and demographics, seasonality, and product hierarchy.

    The resulting ELF system produced an 8-30% reduction in overstock and understock costs, significant for the supply chain industry.
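
For readers unfamiliar with the term, the arithmetic behind "lift" and stock cost can be shown in a few lines. This is an illustrative sketch only, not the ELF system: the unit costs and sales figures are made up, and the cost function is an assumed simplification.

```python
def lift(event_units, baseline_units):
    """Sales lift: the multiplier of event sales over normal (baseline) sales."""
    return event_units / baseline_units

def stock_cost(forecast_units, actual_units, over_cost, under_cost):
    """Cost of stocking to a forecast.

    Leftover units cost `over_cost` each; units of unmet demand cost `under_cost` each.
    """
    if forecast_units >= actual_units:
        return (forecast_units - actual_units) * over_cost
    return (actual_units - forecast_units) * under_cost

# Hypothetical store-item: 40 units in a normal week, 100 units during the event.
baseline = 40
actual = 100
print(lift(actual, baseline))

# Cost of replenishing to three different lift estimates.
for guess in (2.0, 2.5, 3.0):
    print(guess, stock_cost(baseline * guess, actual, over_cost=1.0, under_cost=3.0))
```

A lift estimate that is too low leads to understock cost and one that is too high leads to overstock cost, so the forecast error for each store-item combination translates directly into money.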
