Sample Datasets
This folder contains sample datasets with intents and sample
utterances from different sources:
• Watson/WatsonConversationCarWorkspace.json in the Watson subfolder is an export of the standard sample workspace for demoing the IBM Watson Conversation service on the IBM Cloud.
• PharmacyDataset.json (no need to import)
Embedder — Word one-hot encoding implemented in
Swift using NSLinguisticTagger
This sub-project contains Swift code (to be executed in a macOS or iOS
environment) that imports a JSON file containing the dataset used for
training the NLC model. The importer uses Apple's Foundation
NSLinguisticTagger APIs to analyze and tokenize the text of the sample
utterances, creating a word embedder. In particular, it outputs a one-hot
encoding for stem words, a corpus of documents, and a class of entities,
which are used to train and prepare the TensorFlow model, as well as to
run inference on the model with Core ML.
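For reference, the snippet below is a minimal sketch of the tokenize-and-lemmatize step described above; the helper name lemmas(in:) and the fallback to the lowercased surface form are illustrative assumptions, not the project's actual code.

```swift
import Foundation

// Hypothetical helper: lemmatize one utterance with NSLinguisticTagger.
func lemmas(in text: String) -> [String] {
    let tagger = NSLinguisticTagger(tagSchemes: [.lemma], options: 0)
    let options: NSLinguisticTagger.Options = [.omitPunctuation, .omitWhitespace, .omitOther]
    tagger.string = text
    var result: [String] = []
    let range = NSRange(text.startIndex..., in: text)
    tagger.enumerateTags(in: range, unit: .word, scheme: .lemma, options: options) { tag, tokenRange, _ in
        if let lemma = tag?.rawValue {
            result.append(lemma.lowercased())           // use the stem word when available
        } else if let r = Range(tokenRange, in: text) {
            result.append(String(text[r]).lowercased()) // otherwise fall back to the surface form
        }
    }
    return result
}

// Example: maps each token of a sample utterance to its stem words
// (exact lemmas depend on the tagger's language model).
print(lemmas(in: "I need to refill my prescriptions"))
```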
Usage example:
Embedder import ../SampleDatasets/PharmacyDataset.json
This command produces the following files in the current folder:
bagOfWords.json, lemmatizedDataset.json, and intents.json.
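As a hedged illustration of how these artifacts can feed Core ML inference, the sketch below assumes bagOfWords.json is a flat dictionary mapping each stem word to a column index; the actual layout produced by Embedder may differ, and oneHotVector(for:bagOfWordsURL:) is a hypothetical helper.

```swift
import Foundation

// Hypothetical helper: turn an utterance into a one-hot feature vector
// using the vocabulary exported by Embedder (file layout assumed, see above).
func oneHotVector(for utterance: String, bagOfWordsURL: URL) throws -> [Double] {
    let data = try Data(contentsOf: bagOfWordsURL)
    let vocabulary = try JSONDecoder().decode([String: Int].self, from: data)
    var vector = [Double](repeating: 0, count: vocabulary.count)
    for lemma in lemmas(in: utterance) {      // lemmas(in:) is the sketch shown earlier
        if let index = vocabulary[lemma] {
            vector[index] = 1                 // mark each vocabulary stem that occurs
        }
    }
    return vector
}
```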
ModelNotebook — Instructions to create TensorFlow
model
This is a Python Jupyter Notebook that uses the Keras API with TensorFlow as a
backend to create a simple, fully connected Deep Network Classifier.
If you’re new to the Keras/TensorFlow/Jupyter world, here are step-by-
step instructions to create the ML model using Keras/TensorFlow and
export it to Core ML using CoreMLConversionTool:
1. Download and install Anaconda Python:
https://www.continuum.io/downloads