Releases: sql-machine-learning/sqlflow

Release v0.4.2

28 Aug 12:00
0ac724a
Pre-release

Major Features and Improvements

  • Add three pre-made Runnables: extract_ts_features (extracts time series features using tsfresh), binning, and psi
  • Model Meta Design: get the model metadata (such as the Docker image name for TO TRAIN, the model type, and so on) when generating prediction workflow step code
  • Distinguish XGBoost models when generating prediction workflow code
  • Support configuring HTTPS for JupyterHub
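
The psi Runnable named above presumably computes a population stability index over binned data. As a rough illustration of that metric, and not SQLFlow's own implementation, the sketch below bins two samples with equal-width bins taken from the baseline sample and sums (a - e) * ln(a / e) over the bins. The function names, the default bin count, and the epsilon floor are all assumptions made for this example.

```python
import math


def bin_fractions(values, edges):
    """Return the fraction of values that fall into each bin defined by edges."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        # Pick the last bin whose left edge is <= v; values outside the
        # range are clamped into the first or last bin.
        idx = 0
        for i in range(len(edges) - 1):
            if v >= edges[i]:
                idx = i
        counts[idx] += 1
    total = len(values)
    return [c / total for c in counts]


def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population stability index between a baseline and a new sample.

    Bins are equal-width and derived from the baseline (expected) sample.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins + 1)]
    e = bin_fractions(expected, edges)
    a = bin_fractions(actual, edges)
    total = 0.0
    for ei, ai in zip(e, a):
        # Floor empty bins at eps so the logarithm is always defined.
        ei, ai = max(ei, eps), max(ai, eps)
        total += (ai - ei) * math.log(ai / ei)
    return total


baseline = list(range(100))
shifted = [v + 30 for v in baseline]
print(psi(baseline, baseline))  # identical samples give 0.0
print(psi(baseline, shifted))   # a shifted sample gives a positive value
```

Every term of the sum is non-negative, so the index is zero only when the two binned distributions match.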

Refactorization

  • Implement the end-to-end workflow of XGBoost prediction and evaluation
  • Implement predict and explain in Alisa submitter at runtime
  • Unify the API of local and PAI submitter
  • Simplify HDFS parameters

Bug Fixes

  • Fix importing the Titanic dataset into MaxCompute when the FLOAT data type is not enabled.
  • Fix generating the Couler evaluate step in workflow mode.
  • Fix the paiio table-reading bug when running TO EXPLAIN on PAI.
  • Fix an XGBoost data compatibility issue: support various CSV formats such as a,b,c and a, b, c, as well as strings containing /.
  • Fix an explain issue when SHAP values are not listed.
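
To make the CSV compatibility item above concrete, here is a minimal sketch of a parser that treats a,b,c and a, b, c the same way by stripping whitespace around each field. The function name parse_dense_csv is hypothetical, this is not the code from the fix, and the handling of strings containing / is not covered because the note does not describe it.

```python
def parse_dense_csv(cell):
    """Split a comma-separated cell into floats, ignoring spaces around fields."""
    # Empty fields (for example from a trailing comma) are skipped.
    return [float(field.strip()) for field in cell.split(",") if field.strip()]


print(parse_dense_csv("1,2,3"))
print(parse_dense_csv("1, 2, 3"))
```

Both calls return the same list of floats.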

Release v0.4.1

14 Aug 10:30
b5888cc
Pre-release

Major Features and Improvements

  • The model zoo can be used in the playground now.
  • The CLI supports downloading a model from the model zoo to the local machine.
  • Support the GCN model in the official models repo.
  • CI has been moved to GitHub Actions; Travis CI has been disabled.
  • The TO RUN syntax accepts a file name instead of an absolute path.
  • Non-linear optimization problems are supported by the BARON solver.
  • The CONSTRAINT clause is optional in the TO MAXIMIZE|MINIMIZE statement.

Refactorization

  • End-to-end local XGBoost training can now run in workflow mode.
  • Unify the DBMS APIs behind the Connection and ResultSet interfaces on the Python side.

Bug Fixes

  • Fix the bug that XGBoost training cannot have more than 255 feature columns.
  • Fix the bug that the TiDB parser cannot parse the LAG function.

Release v0.4.0

30 Jul 14:28
bb22929
Pre-release

Major Features and Improvements

  • The parser can remove all comments now.
  • Support linear programming using pyomo and optflow.
  • Add the model zoo's default model definitions to the sqlflow/sqlflow image.
  • Support a custom train loop, predict sample, and evaluation loop in custom models.
  • Move CI jobs from Travis to GitHub Actions, which uses a preconfigured environment to speed up building and testing.
  • Add SQLFlow Playground where users can get a quick experience of SQLFlow.

Refactorization

  • WIP: refactor sqlflow_submitter into runtime. The runtime library supports feature derivation, statement verification, and job submission to various platforms; it executes the workflow step and then saves the model into the database.
  • Remove the is_pai conditions in the runtime.tensorflow package and move the corresponding code that runs on PAI to runtime.pai.

Bug Fixes

  • Fix the size calculation in fillCSVFieldDesc, which was always 0 during feature derivation.

Release v0.3.0-rc.1

22 Jun 13:27
ea036f5
Pre-release

Major Features and Improvements

  • Support the TO EVALUATE clause to evaluate a model.
  • Add the SQLFlow model zoo, which supports publicly sharing model definitions and models.
  • Support mathematical programming using SQL.
  • Support feature columns in XGBoost models, including training, evaluation, prediction, and explanation.
  • Support incremental training for both TensorFlow and XGBoost models.
  • Add logs to record runtime status.
  • The command-line tool supports releasing and removing models and repos.
  • Support the SHOW TRAIN statement to get the original SQL.
  • Create the SQLFlow Playground as a quick-start environment.

Improvements

  • Improve the user experience in workflow mode, including a better workflow log structure and returning selected rows and diagnostic messages to the GUI system.
  • Improve some diagnostic messages in workflow mode.
  • Support passing all selected columns into the prediction result table.
  • Decompose the all-in-one Docker image into separate Docker images.

Release v0.2.0-rc.1

16 Jan 09:37
8f936f0
Pre-release

Major Features and Improvements

  1. Support parsing SQL programs and arbitrary SELECT statements in the extended syntax. #1126
  2. Support feature derivation. #705
  3. Support a highly available SQLFlow server by submitting SQL programs to Kubernetes clusters as workflows. #1066
  4. Enhanced REPL functionality.
  5. Support more training configurations:
    1. Support configuring optimizers for TensorFlow Estimator models.
    2. Support configuring optimizers and losses for custom Keras models.
    3. Support configuring metrics for training TensorFlow Estimator models and Keras models.
  6. Support explaining TensorFlow BoostedTrees models.
  7. Support writing EXPLAIN results to a table.

Breaking changes:

  1. The syntax extension changed from appending TRAIN/PREDICT/ANALYZE to appending TO TRAIN/PREDICT/EXPLAIN. #998
  2. Removed the ALPS and ElasticDL code generators to adapt to the current intermediate representation implementation.

Release v0.1.0-rc.1

16 Sep 05:26
db95106
Compare
Choose a tag to compare
Release v0.1.0-rc.1 Pre-release
Pre-release

SQLFlow release v0.1.0-rc.1 is the first release candidate of SQLFlow.

The current version includes the following features:

  • Database Support
  • Machine Learning Systems and Models Support
  • Feature Columns Supported When Using TensorFlow or Keras Models:
    • numeric_column
    • bucket_column
    • cross_column
    • category_id_column
    • sequence_category_id_column
  • Column Data Type Support:
    • FLOAT/INT/BIGINT
    • VARCHAR/TEXT
      • CSV formatted DENSE Tensor
      • CSV formatted SPARSE Tensor
  • Support Standalone Deployment and Session support: #531
  • Deploy on Kubernetes Cluster: #537
  • Unsupervised Training with Clustering Model: #737
  • Analyze the Machine Learning Model: analyzer_design.md