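Contents
Setting up TensorFlow.js Code with Universal Sentence Encoder
TriviaQA Dataset
Universal Sentence Encoder
Chatbot in Action
Finish Line
What's Next?
- Download Project Code - 9.9 MB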
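TensorFlow + JavaScript. The most popular, cutting-edge AI framework now supports the most widely used programming language on the planet. So let's make text and NLP (Natural Language Processing) chatbot magic happen through deep learning right in the web browser, GPU-accelerated via WebGL using TensorFlow.js!
Version 1 of our trivia expert chatbot, built with a Recurrent Neural Network (RNN), had some shortcomings and limitations. It often failed to predict the matching trivia question, and so could not provide an answer, unless the question was asked word for word as it appears in the database. RNNs learn to predict from sequences, but they don't necessarily know which parts of a sequence are the most significant.
This is where transformers come in handy. We discussed transformers in the previous article, where we showed how they helped improve our emotion detector. Now let's see what they can do for the trivia chatbot.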
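Setting up TensorFlow.js Code with Universal Sentence Encoder
This project is very similar to the first trivia expert code, so let's use the initial codebase as our starting point, with the word embedding, model, and prediction parts removed. We'll add one important and powerful library here, Universal Sentence Encoder (USE), which is a pre-trained, transformer-based language processing model. This is what we'll use to determine the matching trivia question for the chatbot. We'll also add two utility functions, dotProduct and zipWith, from the USE readme example to help us determine sentence similarity.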
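The page markup is minimal. Its title is "Trivia Know-It-All: Chatbots in the Browser with TensorFlow.js", and its body holds a "Trivia Know-It-All Bot" heading, an "Ask a trivia question:" label with a text input, and a "Submit" button. The script below looks up elements with the IDs status, question, submit, bot-question, and bot-answer, and expects the global tf and use objects to be available before it runs.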
function setText( text ) {
    document.getElementById( "status" ).innerText = text;
}

// Calculate the dot product of two vector arrays.
// Returns undefined if the vectors differ in length.
const dotProduct = (xs, ys) => {
    const sum = xs => xs ? xs.reduce((a, b) => a + b, 0) : undefined;
    return xs.length === ys.length ?
        sum(zipWith((a, b) => a * b, xs, ys))
        : undefined;
};

// zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
// Combine two arrays element by element with f, stopping at the shorter one.
const zipWith =
    (f, xs, ys) => {
        const ny = ys.length;
        return (xs.length <= ny ? xs : xs.slice(0, ny))
            .map((x, i) => f(x, ys[i]));
    };
(async () => {
    // Load TriviaQA data
    let triviaData = await fetch( "web/verified-wikipedia-dev.json" ).then( r => r.json() );
    let data = triviaData.Data;

    // Collect the question text of every QA pair; indices line up with data
    let questions = data.map( qa => qa.Question );

    // Load the universal sentence encoder
    // (encoder is not used below; only the QnA model is)
    setText( "Loading USE..." );
    let encoder = await use.load();
    setText( "Loaded!" );
    const model = await use.loadQnA();

    document.getElementById( "question" ).addEventListener( "keyup", function( event ) {
        // Number 13 is the "Enter" key on the keyboard
        if( event.keyCode === 13 ) {
            // Cancel the default action, if needed
            event.preventDefault();
            // Trigger the button element with a click
            document.getElementById( "submit" ).click();
        }
    });

    document.getElementById( "submit" ).addEventListener( "click", async function( event ) {
        let text = document.getElementById( "question" ).value;
        document.getElementById( "question" ).value = "";

        // Embed the user's query together with all candidate questions
        const input = {
            queries: [ text ],
            responses: questions
        };
        // console.log( input );
        let embeddings = await model.embed( input );
        tf.tidy( () => {
            const embed_query = embeddings[ "queryEmbedding" ].arraySync();
            const embed_responses = embeddings[ "responseEmbedding" ].arraySync();
            // Score every candidate question against the query
            let scores = [];
            embed_responses.forEach( response => {
                scores.push( dotProduct( embed_query[ 0 ], response ) );
            });
            // Get the index of the highest value in the prediction
            let id = scores.indexOf( Math.max( ...scores ) );
            document.getElementById( "bot-question" ).innerText = questions[ id ];
            document.getElementById( "bot-answer" ).innerText = data[ id ].Answer.Value;
        });
        // Free the embedding tensors, which were created outside tf.tidy
        embeddings.queryEmbedding.dispose();
        embeddings.responseEmbedding.dispose();
    });
})();
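To see what the two helper functions compute, they can be tried in isolation on small hand-made vectors. The sketch below repeats both helpers so that it runs on its own; the three-element vectors are made-up stand-ins for real sentence embeddings, which are much longer.

```javascript
// Standalone copies of the two helpers, so this snippet runs by itself.
const zipWith = (f, xs, ys) => {
  const ny = ys.length;
  return (xs.length <= ny ? xs : xs.slice(0, ny)).map((x, i) => f(x, ys[i]));
};

const dotProduct = (xs, ys) => {
  const sum = vs => vs ? vs.reduce((a, b) => a + b, 0) : undefined;
  return xs.length === ys.length
    ? sum(zipWith((a, b) => a * b, xs, ys))
    : undefined;
};

// Hypothetical toy "embeddings" (assumption: real ones are far longer).
const query = [1, 2, 0];
const sameDirection = [2, 4, 0];
const orthogonal = [0, 0, 3];

console.log(dotProduct(query, sameDirection)); // 1*2 + 2*4 + 0*0 = 10
console.log(dotProduct(query, orthogonal));    // 1*0 + 2*0 + 0*3 = 0
console.log(dotProduct(query, [1, 2]));        // undefined: lengths differ
console.log(zipWith((a, b) => a + b, [1, 2, 3], [10, 20])); // [ 11, 22 ]
```

A larger dot product means the two vectors point in a more similar direction, which is why the chatbot picks the candidate question with the highest score.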
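TriviaQA Dataset
The data we'll use for our improved trivia expert chatbot is the same as before: the TriviaQA dataset made available by the University of Washington. It includes 95 thousand trivia question-answer pairs but, to keep things simpler and faster, we'll use a smaller subset, verified-wikipedia-dev.json, which is included in this project's sample code.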
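The chatbot code reads only two fields from each entry of that file, Question and Answer.Value, both under a top-level Data array. As a rough sketch of that shape, the snippet below runs the same extraction on a tiny hand-written sample; the two entries and the helper name extractQA are made up for illustration and are not taken from the dataset or the project code.

```javascript
// Hypothetical sample mimicking only the fields the chatbot code reads.
const sample = {
  Data: [
    { Question: "Which planet is known as the Red Planet?", Answer: { Value: "Mars" } },
    { Question: "What is the capital of France?", Answer: { Value: "Paris" } }
  ]
};

// extractQA is an illustrative helper name, not part of the project code.
// It returns two parallel arrays, so questions[i] belongs with answers[i].
function extractQA(triviaData) {
  const data = triviaData.Data;
  return {
    questions: data.map(qa => qa.Question),
    answers: data.map(qa => qa.Answer.Value)
  };
}

const { questions, answers } = extractQA(sample);
console.log(questions.length); // 2
console.log(answers[1]);       // Paris
```

Keeping the two arrays index-aligned is what lets the chatbot look up an answer from the position of the best-matching question.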
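Universal Sentence Encoder
The Universal Sentence Encoder (USE) is "a [pre-trained] model that encodes text into 512-dimensional embeddings." For a complete description of the USE and its architecture, please see the previous article.
USE is easy and straightforward to use. Let's load it in our code, along with its QnA dual encoder, which gives us full-sentence embeddings for all queries and all answers.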
// Load the universal sentence encoder
setText( "Loading USE..." );
let encoder = await use.load();
setText( "Loaded!" );
const model = await use.loadQnA();
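Chatbot in Action
Because the sentence embeddings already encode similarity into their vectors, we don't need to train another model. All we need to do is figure out which of the trivia questions is most similar to the question the user submitted. Let's do that by using the QnA encoder and finding the best-matching question.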
document.getElementById( "submit" ).addEventListener( "click", async function( event ) {
    let text = document.getElementById( "question" ).value;
    document.getElementById( "question" ).value = "";

    // Embed the user's query together with all candidate questions
    const input = {
        queries: [ text ],
        responses: questions
    };
    // console.log( input );
    let embeddings = await model.embed( input );
    tf.tidy( () => {
        const embed_query = embeddings[ "queryEmbedding" ].arraySync();
        const embed_responses = embeddings[ "responseEmbedding" ].arraySync();
        // Score every candidate question against the query
        let scores = [];
        embed_responses.forEach( response => {
            scores.push( dotProduct( embed_query[ 0 ], response ) );
        });
        // Get the index of the highest value in the prediction
        let id = scores.indexOf( Math.max( ...scores ) );
        document.getElementById( "bot-question" ).innerText = questions[ id ];
        document.getElementById( "bot-answer" ).innerText = data[ id ].Answer.Value;
    });
    // Free the embedding tensors, which were created outside tf.tidy
    embeddings.queryEmbedding.dispose();
    embeddings.responseEmbedding.dispose();
});
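If everything went well, you'll notice that we now have a well-performing chatbot that can pull up the proper trivia question-answer pair from just a keyword or two.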
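The click handler shows only the single highest-scoring question. When a short keyword query could plausibly match several questions, it may help to look at the few best candidates instead. The following is a minimal sketch of that variation and is not part of the original project; topMatches is a hypothetical helper, and the scores are made-up numbers.

```javascript
// topMatches is a hypothetical helper: it returns the k highest-scoring
// entries as { index, score } objects, best first.
function topMatches(scores, k) {
  return scores
    .map((score, index) => ({ index, score }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

// Made-up similarity scores, one per candidate question.
const scores = [0.12, 0.87, 0.45, 0.91];
const best = topMatches(scores, 2);
console.log(best.map(m => m.index)); // [ 3, 1 ]
```

In the handler, best[0].index would play the role of id, and the remaining entries could be listed as alternative matches.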
Finish Line
To wrap up this project, here is the complete code:
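The page is titled "Trivia Know-It-All: Chatbots in the Browser with TensorFlow.js". Its body holds a "Trivia Know-It-All Bot" heading, an "Ask a trivia question:" label with a text input, and a "Submit" button. The script looks up elements with the IDs status, question, submit, bot-question, and bot-answer, and expects the global tf and use objects to be available before it runs.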
function setText( text ) {
    document.getElementById( "status" ).innerText = text;
}

// Calculate the dot product of two vector arrays.
// Returns undefined if the vectors differ in length.
const dotProduct = (xs, ys) => {
    const sum = xs => xs ? xs.reduce((a, b) => a + b, 0) : undefined;
    return xs.length === ys.length ?
        sum(zipWith((a, b) => a * b, xs, ys))
        : undefined;
};

// zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
// Combine two arrays element by element with f, stopping at the shorter one.
const zipWith =
    (f, xs, ys) => {
        const ny = ys.length;
        return (xs.length <= ny ? xs : xs.slice(0, ny))
            .map((x, i) => f(x, ys[i]));
    };
(async () => {
    // Load TriviaQA data
    let triviaData = await fetch( "web/verified-wikipedia-dev.json" ).then( r => r.json() );
    let data = triviaData.Data;

    // Collect the question text of every QA pair; indices line up with data
    let questions = data.map( qa => qa.Question );

    // Load the universal sentence encoder
    // (encoder is not used below; only the QnA model is)
    setText( "Loading USE..." );
    let encoder = await use.load();
    setText( "Loaded!" );
    const model = await use.loadQnA();

    document.getElementById( "question" ).addEventListener( "keyup", function( event ) {
        // Number 13 is the "Enter" key on the keyboard
        if( event.keyCode === 13 ) {
            // Cancel the default action, if needed
            event.preventDefault();
            // Trigger the button element with a click
            document.getElementById( "submit" ).click();
        }
    });

    document.getElementById( "submit" ).addEventListener( "click", async function( event ) {
        let text = document.getElementById( "question" ).value;
        document.getElementById( "question" ).value = "";

        // Embed the user's query together with all candidate questions
        const input = {
            queries: [ text ],
            responses: questions
        };
        // console.log( input );
        let embeddings = await model.embed( input );
        tf.tidy( () => {
            const embed_query = embeddings[ "queryEmbedding" ].arraySync();
            const embed_responses = embeddings[ "responseEmbedding" ].arraySync();
            // Score every candidate question against the query
            let scores = [];
            embed_responses.forEach( response => {
                scores.push( dotProduct( embed_query[ 0 ], response ) );
            });
            // Get the index of the highest value in the prediction
            let id = scores.indexOf( Math.max( ...scores ) );
            document.getElementById( "bot-question" ).innerText = questions[ id ];
            document.getElementById( "bot-answer" ).innerText = data[ id ].Answer.Value;
        });
        // Free the embedding tensors, which were created outside tf.tidy
        embeddings.queryEmbedding.dispose();
        embeddings.responseEmbedding.dispose();
    });
})();
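What's Next?
Now that we've learned to create a knowledge chatbot, how about something with a little more lights, camera, and action? Let's create a chatbot we can have a dialogue with.
Build along with me in the next article in this series, Movie Dialogue Chatbot in the Browser using TensorFlow.js.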
https://www.codeproject.com/Articles/5282692/AI-Chatbots-With-TensorFlow-js-Improved-Trivia-Exp