
AI Chatbots with TensorFlow.js, Part 1: Detecting Emotion in Text

寒冰屋 · Published 2020-12-28 21:15:08

Contents

Setting Up TensorFlow.js Code

GoEmotions Dataset

Bag of Words

Training the AI Model

Detecting Emotion in Text

Finish Line

What's Next?

  • Download project code - 9.9 MB

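TensorFlow + JavaScript. The most popular, cutting-edge AI framework now supports the most widely used programming language on the planet. So let's make text and NLP (natural language processing) chatbot magic happen through deep learning right in the web browser, GPU-accelerated via WebGL using TensorFlow.js!

When babies learn their first words, they don't look up the meaning in a dictionary; they form emotional connections to expressions. Recognizing emotion in speech is key to understanding natural language. How can we teach a computer to determine the emotion in a sentence through deep learning?

I'm assuming that you are familiar with TensorFlow.js and comfortable creating and training neural networks with it.

If you are new to TensorFlow.js, I recommend first reading the guide Getting Started with Deep Learning in Your Browser Using TensorFlow.js.

Setting Up TensorFlow.js Code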

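This project runs entirely within a web page. Here is a starter template page that includes TensorFlow.js and reserves a section for our code. Let's add two text elements to this page to show the emotion detection, along with two utility functions that we will need later.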


    
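<html>
    <head>
        <title>Detecting Emotion in Text: Chatbots in the Browser with TensorFlow.js</title>
        <!-- The script URL was lost from this copy; the CDN path below is an assumption -->
        <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@latest/dist/tf.min.js"></script>
    </head>
    <body>
        <!-- Element types are assumed; only the ids "text" and "status" are confirmed by the script -->
        <p id="text"></p>
        <h1 id="status">Loading...</h1>
        <script>
        // Show a status message on the page
        function setText( text ) {
            document.getElementById( "status" ).innerText = text;
        }

        // Shuffle an array in place (Fisher-Yates)
        function shuffleArray( array ) {
            for( let i = array.length - 1; i > 0; i-- ) {
                const j = Math.floor( Math.random() * ( i + 1 ) );
                [ array[ i ], array[ j ] ] = [ array[ j ], array[ i ] ];
            }
        }

        (async () => {
            // Your Code Goes Here
        })();
        </script>
    </body>
</html>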
GoEmotions Dataset

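The data we will use to train our neural network comes from the GoEmotions dataset in the Google Research GitHub repository. It consists of 58,000 English Reddit comments labeled with 27 emotion categories plus a neutral label. You can train on the full set if you like, but we only need a small subset for this project, so downloading this smaller test set is enough.

Place the file in a project folder, such as "web", where your web page can retrieve it from your local web server.

At the top of your script, define a list of emotion categories that will be used for training and prediction: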

const emotions = [
    "admiration",
    "amusement",
    "anger",
    "annoyance",
    "approval",
    "caring",
    "confusion",
    "curiosity",
    "desire",
    "disappointment",
    "disapproval",
    "disgust",
    "embarrassment",
    "excitement",
    "fear",
    "gratitude",
    "grief",
    "joy",
    "love",
    "nervousness",
    "optimism",
    "pride",
    "realization",
    "relief",
    "remorse",
    "sadness",
    "surprise",
    "neutral"
];

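The test set .tsv file we downloaded contains lines of text, each made up of tab-separated elements: a sentence, emotion category identifiers, and a unique sentence identifier. We can load the data and randomize the lines of text in our code like this: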

(async () => {
    // Load GoEmotions data (https://github.com/google-research/google-research/tree/master/goemotions)
    let data = await fetch( "web/emotions.tsv" ).then( r => r.text() );
    let lines = data.split( "\n" ).filter( x => !!x ); // Split & remove empty lines

    // Randomize the lines
    shuffleArray( lines );
})();
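Each line can be split into its three fields before any further processing. The following is a minimal sketch, assuming the tab-separated layout described above; `parseLine` and the sample line are hypothetical and not part of the original project.

```javascript
// Hypothetical helper: split one GoEmotions TSV line into its three fields.
function parseLine( line ) {
    const [ sentence, categoryIds, id ] = line.split( "\t" );
    return {
        sentence,
        // A sentence may carry several comma-separated category identifiers
        categories: categoryIds.split( "," ).map( x => parseInt( x, 10 ) ),
        id
    };
}

// Made-up sample line, for illustration only
const sample = parseLine( "Thanks, that made my day!\t15,17\tabc123" );
console.log( JSON.stringify( sample.categories ) ); // [15,17]
```

Keeping all of a sentence's category identifiers matters because a single comment can be labeled with more than one emotion.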
Bag of Words

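Before passing sentences to the neural network, we need to turn them into sets of numbers.

A classic, simple approach is to keep a full vocabulary list of the words we wish to use and create a vector whose length equals the size of that list, where each component maps to one of the words in the list. Then, for every unique word in the sentence, we set the matching component to 1 and leave the rest at 0.

For example, if you are working with a vocabulary list of [ "deep", "learning", "in", "the", "browser", "detect", "emotion" ], then the sentence "detect emotion in my browser" would produce the vector [ 0, 0, 1, 0, 1, 1, 1 ].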
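The example above can be reproduced in a few lines. This is a sketch only: `toVector` is a hypothetical helper, and it borrows the same letters-only, lowercase tokenization that the project code uses.

```javascript
// Hypothetical helper: turn a sentence into a 0/1 vector over a fixed vocabulary.
function toVector( sentence, vocabulary ) {
    // Keep letters and spaces only, lowercase, then split into unique words
    const words = new Set(
        sentence.replace( /[^a-z ]/gi, "" ).toLowerCase().split( " " ).filter( x => !!x )
    );
    // One component per vocabulary word: 1 if the sentence contains it, else 0
    return vocabulary.map( w => ( words.has( w ) ? 1 : 0 ) );
}

const vocabulary = [ "deep", "learning", "in", "the", "browser", "detect", "emotion" ];
console.log( JSON.stringify( toVector( "detect emotion in my browser", vocabulary ) ) );
// [0,0,1,0,1,1,1]
```

Note that "my" is not in the vocabulary, so it is simply ignored, and word order is lost entirely.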

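In our code, we will take 200 sample lines from the shuffled, parsed text set, use them to build a vocabulary list, and generate the vectors for training. Let's also generate the expected output classification vectors that map to each sentence's emotion categories.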

// Process 200 lines to generate a "bag of words"
const numSamples = 200;
let bagOfWords = {};
let allWords = [];
let wordReference = {};
let sentences = lines.slice( 0, numSamples ).map( line => {
    let sentence = line.split( "\t" )[ 0 ];
    return sentence;
});

sentences.forEach( s => {
    let words = s.replace(/[^a-z ]/gi, "").toLowerCase().split( " " ).filter( x => !!x );
    words.forEach( w => {
        if( !bagOfWords[ w ] ) {
            bagOfWords[ w ] = 0;
        }
        bagOfWords[ w ]++; // Counting occurrence just for word frequency fun
    });
});

allWords = Object.keys( bagOfWords );
allWords.forEach( ( w, i ) => {
    wordReference[ w ] = i;
});

// Generate vectors for sentences
let vectors = sentences.map( s => {
    let vector = new Array( allWords.length ).fill( 0 );
    let words = s.replace(/[^a-z ]/gi, "").toLowerCase().split( " " ).filter( x => !!x );
    words.forEach( w => {
        if( w in wordReference ) {
            vector[ wordReference[ w ] ] = 1;
        }
    });
    return vector;
});

let outputs = lines.slice( 0, numSamples ).map( line => {
    let categories = line.split( "\t" )[ 1 ].split( "," ).map( x => parseInt( x ) );
    let output = [];
    for( let i = 0; i < emotions.length; i++ ) {
        output.push( categories.includes( i ) ? 1 : 0 );
    }
    return output;
});
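One caveat with plain objects as lookup tables: a word such as "constructor" collides with properties inherited from Object.prototype, so its count becomes NaN and the `in` test can succeed for words that were never added. Below is a sketch of the same vocabulary-building step using `Map`, which has no inherited keys. The names `tokenize` and `buildVocabulary` are hypothetical, and this is an alternative, not the author's code.

```javascript
// Hypothetical helper: same tokenization rule as the code above.
function tokenize( sentence ) {
    return sentence.replace( /[^a-z ]/gi, "" ).toLowerCase().split( " " ).filter( x => !!x );
}

// Build word counts and a word -> index lookup with Map instead of {}.
function buildVocabulary( sentences ) {
    const counts = new Map();
    sentences.forEach( s => {
        tokenize( s ).forEach( w => counts.set( w, ( counts.get( w ) || 0 ) + 1 ) );
    });
    const allWords = [ ...counts.keys() ]; // insertion order is preserved
    const wordReference = new Map( allWords.map( ( w, i ) => [ w, i ] ) );
    return { counts, allWords, wordReference };
}

const vocab = buildVocabulary( [ "The constructor failed", "the build failed again" ] );
console.log( vocab.counts.get( "constructor" ) );      // 1
console.log( vocab.wordReference.has( "tostring" ) );  // false
```

With this variant, lookups would use `wordReference.has( w )` and `wordReference.get( w )` in place of `w in wordReference` and `wordReference[ w ]`.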
Training the AI Model

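Now for the fun part. We can define a model with three hidden layers leading to a categorical output vector of length 28 (the 27 emotion categories plus neutral, matching emotions.length), where the index of the maximum value is our predicted emotion identifier.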

// Define our model with several hidden layers
const model = tf.sequential();
model.add(tf.layers.dense( { units: 100, activation: "relu", inputShape: [ allWords.length ] } ) );
model.add(tf.layers.dense( { units: 50, activation: "relu" } ) );
model.add(tf.layers.dense( { units: 25, activation: "relu" } ) );
model.add(tf.layers.dense( {
    units: emotions.length,
    activation: "softmax"
} ) );

model.compile({
    optimizer: tf.train.adam(),
    loss: "categoricalCrossentropy",
    metrics: [ "accuracy" ]
});

Finally, we can convert our input data to tensors and train the network.

const xs = tf.stack( vectors.map( x => tf.tensor1d( x ) ) );
const ys = tf.stack( outputs.map( x => tf.tensor1d( x ) ) );
await model.fit( xs, ys, {
    epochs: 50,
    shuffle: true,
    callbacks: {
        onEpochEnd: ( epoch, logs ) => {
            setText( `Training... Epoch #${epoch} (${logs.acc})` );
            console.log( "Epoch #", epoch, logs );
        }
    }
} );
Detecting Emotion in Text

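It's time to let our AI do its magic.

To test the trained network, we will pick a random line of text from the full list, generate its input vector from the bag of words, and pass it to the model to predict a category. This section of code runs on a 5-second timer so that a new line of text is loaded each time.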

// Test prediction every 5s
setInterval( async () => {
    // Pick random text
    let line = lines[ Math.floor( Math.random() * lines.length ) ];
    let sentence = line.split( "\t" )[ 0 ];
    let categories = line.split( "\t" )[ 1 ].split( "," ).map( x => parseInt( x ) );
    document.getElementById( "text" ).innerText = sentence;

    // Generate vectors for sentences
    let vector = new Array( allWords.length ).fill( 0 );
    let words = sentence.replace(/[^a-z ]/gi, "").toLowerCase().split( " " ).filter( x => !!x );
    words.forEach( w => {
        if( w in wordReference ) {
            vector[ wordReference[ w ] ] = 1;
        }
    });

    let prediction = await model.predict( tf.stack( [ tf.tensor1d( vector ) ] ) ).data();
    // Get the index of the highest value in the prediction
    let id = prediction.indexOf( Math.max( ...prediction ) );
    setText( `Result: ${emotions[ id ]}, Expected: ${emotions[ categories[ 0 ] ]}` );
}, 5000 );
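Because the softmax output assigns a score to every category, it can be useful to look at more than the single best index. Here is a sketch of a top-k lookup over an array of scores; `topEmotions` is a hypothetical helper, and the shortened label list and scores are made up for illustration.

```javascript
// Hypothetical helper: return the k highest-scoring labels with their scores.
function topEmotions( scores, labels, k = 3 ) {
    return Array.from( scores ) // also accepts a typed array such as the result of .data()
        .map( ( score, index ) => ( { label: labels[ index ], score } ) )
        .sort( ( a, b ) => b.score - a.score )
        .slice( 0, k );
}

const labels = [ "anger", "joy", "sadness", "neutral" ]; // shortened list for illustration
const scores = [ 0.05, 0.6, 0.1, 0.25 ];                  // made-up softmax output
console.log( topEmotions( scores, labels, 2 ).map( e => e.label ) ); // [ 'joy', 'neutral' ]
```

In the timer callback above, `topEmotions( prediction, emotions )` would list the three strongest candidates instead of only the winner.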

Finish Line

Here is the full code for reference:


    
<html>
    <head>
        <title>Detecting Emotion in Text: Chatbots in the Browser with TensorFlow.js</title>
        <!-- The script URL was lost from this copy; the CDN path below is an assumption -->
        <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@latest/dist/tf.min.js"></script>
    </head>
    <body>
        <!-- Element types are assumed; only the ids "text" and "status" are confirmed by the script -->
        <p id="text"></p>
        <h1 id="status">Loading...</h1>
        <script>
        const emotions = [
            "admiration",
            "amusement",
            "anger",
            "annoyance",
            "approval",
            "caring",
            "confusion",
            "curiosity",
            "desire",
            "disappointment",
            "disapproval",
            "disgust",
            "embarrassment",
            "excitement",
            "fear",
            "gratitude",
            "grief",
            "joy",
            "love",
            "nervousness",
            "optimism",
            "pride",
            "realization",
            "relief",
            "remorse",
            "sadness",
            "surprise",
            "neutral"
        ];

        function setText( text ) {
            document.getElementById( "status" ).innerText = text;
        }

        function shuffleArray( array ) {
            for( let i = array.length - 1; i > 0; i-- ) {
                const j = Math.floor( Math.random() * ( i + 1 ) );
                [ array[ i ], array[ j ] ] = [ array[ j ], array[ i ] ];
            }
        }

        (async () => {
            // Load GoEmotions data (https://github.com/google-research/google-research/tree/master/goemotions)
            let data = await fetch( "web/emotions.tsv" ).then( r => r.text() );
            let lines = data.split( "\n" ).filter( x => !!x ); // Split & remove empty lines

            // Randomize the lines
            shuffleArray( lines );

            // Process 200 lines to generate a "bag of words"
            const numSamples = 200;
            let bagOfWords = {};
            let allWords = [];
            let wordReference = {};
            let sentences = lines.slice( 0, numSamples ).map( line => {
                let sentence = line.split( "\t" )[ 0 ];
                return sentence;
            });

            sentences.forEach( s => {
                let words = s.replace(/[^a-z ]/gi, "").toLowerCase().split( " " ).filter( x => !!x );
                words.forEach( w => {
                    if( !bagOfWords[ w ] ) {
                        bagOfWords[ w ] = 0;
                    }
                    bagOfWords[ w ]++; // Counting occurrence just for word frequency fun
                });
            });

            allWords = Object.keys( bagOfWords );
            allWords.forEach( ( w, i ) => {
                wordReference[ w ] = i;
            });

            // Generate vectors for sentences
            let vectors = sentences.map( s => {
                let vector = new Array( allWords.length ).fill( 0 );
                let words = s.replace(/[^a-z ]/gi, "").toLowerCase().split( " " ).filter( x => !!x );
                words.forEach( w => {
                    if( w in wordReference ) {
                        vector[ wordReference[ w ] ] = 1;
                    }
                });
                return vector;
            });

            let outputs = lines.slice( 0, numSamples ).map( line => {
                let categories = line.split( "\t" )[ 1 ].split( "," ).map( x => parseInt( x ) );
                let output = [];
                for( let i = 0; i < emotions.length; i++ ) {
                    output.push( categories.includes( i ) ? 1 : 0 );
                }
                return output;
            });

            // Define our model with several hidden layers
            const model = tf.sequential();
            model.add(tf.layers.dense( { units: 100, activation: "relu", inputShape: [ allWords.length ] } ) );
            model.add(tf.layers.dense( { units: 50, activation: "relu" } ) );
            model.add(tf.layers.dense( { units: 25, activation: "relu" } ) );
            model.add(tf.layers.dense( {
                units: emotions.length,
                activation: "softmax"
            } ) );

            model.compile({
                optimizer: tf.train.adam(),
                loss: "categoricalCrossentropy",
                metrics: [ "accuracy" ]
            });

            const xs = tf.stack( vectors.map( x => tf.tensor1d( x ) ) );
            const ys = tf.stack( outputs.map( x => tf.tensor1d( x ) ) );
            await model.fit( xs, ys, {
                epochs: 50,
                shuffle: true,
                callbacks: {
                    onEpochEnd: ( epoch, logs ) => {
                        setText( `Training... Epoch #${epoch} (${logs.acc})` );
                        console.log( "Epoch #", epoch, logs );
                    }
                }
            } );

            // Test prediction every 5s
            setInterval( async () => {
                // Pick random text
                let line = lines[ Math.floor( Math.random() * lines.length ) ];
                let sentence = line.split( "\t" )[ 0 ];
                let categories = line.split( "\t" )[ 1 ].split( "," ).map( x => parseInt( x ) );
                document.getElementById( "text" ).innerText = sentence;

                // Generate vectors for sentences
                let vector = new Array( allWords.length ).fill( 0 );
                let words = sentence.replace(/[^a-z ]/gi, "").toLowerCase().split( " " ).filter( x => !!x );
                words.forEach( w => {
                    if( w in wordReference ) {
                        vector[ wordReference[ w ] ] = 1;
                    }
                });

                let prediction = await model.predict( tf.stack( [ tf.tensor1d( vector ) ] ) ).data();
                // Get the index of the highest value in the prediction
                let id = prediction.indexOf( Math.max( ...prediction ) );
                setText( `Result: ${emotions[ id ]}, Expected: ${emotions[ categories[ 0 ] ]}` );
            }, 5000 );
        })();
        </script>
    </body>
</html>
What's Next?

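In this article, you learned how to train an AI model with TensorFlow in the browser that can assign one of 27 emotions (or neutral) to any English sentence. Try increasing numSamples from 200 to 1,000, or even the entire list, and see whether your emotion detector improves its accuracy. Now, what if we want our neural network to parse text and classify it into more than 27 categories?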
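To check whether more samples actually help, predictions can be scored against the labels. The following is a sketch under one stated assumption: a prediction counts as correct when its index appears anywhere among the sentence's labeled categories. The `accuracy` helper is hypothetical, and the numbers are made up.

```javascript
// Hypothetical helper: fraction of predictions that match any labeled category.
function accuracy( predictedIds, expectedCategories ) {
    if( predictedIds.length === 0 ) {
        return 0;
    }
    const hits = predictedIds.filter(
        ( id, i ) => expectedCategories[ i ].includes( id )
    ).length;
    return hits / predictedIds.length;
}

// Four made-up predictions scored against made-up labels
console.log( accuracy( [ 17, 2, 27, 0 ], [ [ 15, 17 ], [ 3 ], [ 27 ], [ 0, 4 ] ] ) ); // 0.75
```

Scoring on lines that were not among the first numSamples gives a fairer picture than scoring on the training sentences themselves.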

Follow along with the next article in this series, Training a Trivia Expert Chatbot in the Browser with TensorFlow.js!

https://www.codeproject.com/Articles/5282687/AI-Chatbots-With-TensorFlow-js-Detecting-Emotion-i
