FFmpeg + Qt: Converting Audio Files to PCM Data

0. Preface

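PCM (Pulse Code Modulation) audio data is a raw stream of uncompressed audio samples: standard digital audio data produced from an analog signal by sampling, quantization, and encoding.

PCM data is described by six parameters (see "PCM音频数据 - 简书"):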

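Sample Rate : sampling frequency, e.g. 8 kHz (telephone), 44.1 kHz (CD), 48 kHz (DVD).
Sample Size : quantization bit depth. Usually 16-bit.
Number of Channels : channel count. The common layouts are stereo and mono; stereo contains a left and a right channel. Surround sound and other less common layouts also exist.
Sign : whether samples are signed. For a one-byte sample, signed values range from -128 to 127 and unsigned values from 0 to 255.
Byte Ordering : little-endian or big-endian. Usually little-endian.
Integer Or Floating Point : most PCM formats store samples as integers; applications that need higher precision store them as floating-point values.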
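
To make the parameters above concrete, here is a minimal sketch of how they determine data size and how one signed 16-bit little-endian sample is read from raw bytes. The `PcmFormat` struct and the helper names are hypothetical, made up for this illustration; they are not part of FFmpeg or of the project code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper type for this sketch (not an FFmpeg type).
struct PcmFormat {
    int sampleRate;     // samples per second per channel, e.g. 44100
    int bitsPerSample;  // sample size, e.g. 16
    int channels;       // 1 = mono, 2 = stereo
};

// Bytes occupied by one sample of every channel (one PCM "frame").
inline std::size_t bytesPerFrame(const PcmFormat &f) {
    assert(f.bitsPerSample > 0 && f.bitsPerSample % 8 == 0 && f.channels > 0);
    return static_cast<std::size_t>(f.bitsPerSample / 8) * f.channels;
}

// Bytes needed to store one second of audio.
inline std::size_t bytesPerSecond(const PcmFormat &f) {
    assert(f.sampleRate > 0);
    return bytesPerFrame(f) * static_cast<std::size_t>(f.sampleRate);
}

// Decode one signed 16-bit little-endian sample: low byte first, then high byte.
inline std::int16_t decodeS16LE(const std::uint8_t *p) {
    const std::uint16_t raw = static_cast<std::uint16_t>(p[0] | (p[1] << 8));
    return static_cast<std::int16_t>(raw);
}
```

For CD-style audio (44.1 kHz, 16-bit, stereo) this gives 4 bytes per frame and 176400 bytes per second.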

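This article uses resampling to convert the data in an audio file into PCM data with a specified set of parameters. In FFmpeg, resampling is provided by libswresample, a highly optimized module for converting the sample rate, channel layout, or sample format of audio. Reading a file's PCM data directly without conversion is awkward because there are so many possible formats; after resampling, further processing such as drawing a waveform becomes much easier.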

Reference example: ffmpeg-4.2.4\doc\examples\resampling_audio.c

Reference blog: FFmpeg音频重采样API(libswresample) - 简书

Reference blog: https://segmentfault.com/a/1190000025145553

Code for this article (FFmpeg libraries not included): MyTestCode/Qt/GetAudioInfo at master · gongjianbo/MyTestCode · GitHub

Project download on CSDN: GetAudioInfo_VS2017x64.rar_qt使用ffmpeg获取pcm数据-编解码文档类资源-CSDN下载

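(2021-04-01) Previously, when exporting multiple channels, only a single channel's data was exported; this is now fixed. (The original design was meant to allow splitting a stereo source into two mono channels.)

(2021-12-27) Issue 1: the bit rate used to be read from AVFormatContext, but for a video file that value cannot be used as the audio bit rate, so it is now read from AVCodecContext. Some files do not provide bit rate information in AVCodecContext, in which case the value from AVFormatContext is used instead. Issue 2: the channel count was previously left out of the calculation when resizing the buffer. The preallocated size is fairly large, though, so the resize path is rarely reached.

(2022-08-25) Fixed the sample precision being read incorrectly, e.g. for 24-bit audio. The AVSampleFormat enum that was read previously only covers a fixed set of precisions; the actual precision is now obtained from members of the AVCodecParameters struct.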
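
The two metadata fixes described in these notes both come down to "prefer the more specific value, fall back when it is missing". A minimal sketch of that selection logic follows, with plain integers standing in for the FFmpeg fields; the function names are hypothetical and this is not the project's actual code.

```cpp
#include <cassert>
#include <cstdint>

// Prefer the codec-level bit rate; fall back to the container-level bit rate
// when the codec reports none (zero or negative).
inline std::int64_t chooseBitRate(std::int64_t codecBitRate, std::int64_t containerBitRate) {
    return codecBitRate > 0 ? codecBitRate : containerBitRate;
}

// Prefer the raw bit depth when it is known (e.g. 24); otherwise use the width
// of the decoded sample format (e.g. 4 bytes -> 32 bits).
inline int chooseSampleBits(int bitsPerRawSample, int bytesPerDecodedSample) {
    assert(bytesPerDecodedSample > 0);
    return bitsPerRawSample > 0 ? bitsPerRawSample : bytesPerDecodedSample * 8;
}
```

With these helpers, a 24-bit file decoded into 32-bit samples is reported as 24-bit, and a file whose codec reports no bit rate falls back to the container's value.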

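1. Main interfaces

swr_alloc_set_opts

This function is equivalent to swr_alloc followed by setting the SwrContext options: it allocates the context if needed and sets its parameters in one call. swr_init must still be called before the context is used for conversion. For the input parameters, take the values from the input decoder context (AVCodecContext). The output parameters can be chosen freely, which is how the format conversion is achieved.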

/**
 * @param s               existing resampling context, or NULL
 * @param out_ch_layout   output channel layout (AV_CH_LAYOUT_*)
 * @param out_sample_fmt  output sample format, e.g. signed 16-bit (AV_SAMPLE_FMT_*)
 * @param out_sample_rate output sample rate (frequency in Hz)
 * @param in_ch_layout    input channel layout (AV_CH_LAYOUT_*)
 * @param in_sample_fmt   input sample format (AV_SAMPLE_FMT_*)
 * @param in_sample_rate  input sample rate (frequency in Hz)
 * @param log_offset      logging level offset
 * @param log_ctx         parent logging context, can be NULL
 *
 * @return the resampling context, or NULL on failure
 */
struct SwrContext *swr_alloc_set_opts(struct SwrContext *s,
                                      int64_t out_ch_layout, enum AVSampleFormat out_sample_fmt, int out_sample_rate,
                                      int64_t  in_ch_layout, enum AVSampleFormat  in_sample_fmt, int  in_sample_rate,
                                      int log_offset, void *log_ctx);

swr_convert

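This function performs the resampling conversion. Before converting, the source data must be decoded with av_read_frame + avcodec_send_packet + avcodec_receive_frame. Once a frame is decoded, pass the AVFrame's data as the input buffer and its nb_samples as the input sample count.

The output parameters deserve the most attention. The buffer is an array of pointers. With planar storage, each channel is written to its own array, e.g. [0]=LLLL and [1]=RRRR; with packed storage only array [0] is used and, for stereo, the left and right samples are interleaved: [0]=LRLRLRLR. How large should the buffer be? Use av_rescale_rnd or swr_get_out_samples to estimate the number of converted samples, which is usually slightly larger than the actual count, then multiply by the channel count and the bytes per sample to get the required buffer size. If more input samples (nb_samples) are supplied than fit in the output space, the SwrContext buffers the remainder. When data is buffered, calling swr_get_out_samples with its second argument set to 0 returns the size of the buffered data, and calling swr_convert with its last two arguments set to NULL and 0 retrieves it.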
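
The planar and packed layouts described above can be illustrated without FFmpeg. The sketch below interleaves planar channels into one packed buffer and pulls a single channel back out, which is also the idea behind splitting a stereo stream into two mono streams. It assumes 16-bit samples, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Interleave planar channels ([0]=LLLL, [1]=RRRR) into one packed buffer (LRLRLRLR).
inline std::vector<std::int16_t> planarToPacked(
        const std::vector<std::vector<std::int16_t>> &planes) {
    assert(!planes.empty());
    const std::size_t channels = planes.size();
    const std::size_t samples = planes[0].size();
    std::vector<std::int16_t> packed(channels * samples);
    for (std::size_t ch = 0; ch < channels; ++ch) {
        assert(planes[ch].size() == samples);  // every channel must have the same length
        for (std::size_t i = 0; i < samples; ++i)
            packed[i * channels + ch] = planes[ch][i];
    }
    return packed;
}

// Extract one channel from a packed buffer, e.g. the right channel of a stereo stream.
inline std::vector<std::int16_t> extractChannel(
        const std::vector<std::int16_t> &packed, std::size_t channels, std::size_t channel) {
    assert(channels > 0 && channel < channels && packed.size() % channels == 0);
    std::vector<std::int16_t> out(packed.size() / channels);
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] = packed[i * channels + channel];
    return out;
}
```

For example, the planes {1, 2, 3} and {10, 20, 30} become the packed buffer {1, 10, 2, 20, 3, 30}, and extracting channel 1 from that buffer gives back {10, 20, 30}.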

/** 
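 * @param s         valid resampling context
 * @param out       output buffers; for packed audio only the first one needs to be set
 * @param out_count space available for output, in samples per channel
 * @param in        input buffers
 * @param in_count  number of input samples per channel
 *
 * @return number of samples output per channel, or a negative value on error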
 */
int swr_convert(struct SwrContext *s, uint8_t **out, int out_count,
                                const uint8_t **in , int in_count);
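
The output buffer sizing can be worked through by hand. The sketch below mirrors the rounding-up arithmetic that FFmpeg's resampling example performs with av_rescale_rnd(..., AV_ROUND_UP), but it is written in plain C++ with hypothetical function names, and the number of samples still buffered inside the resampler is passed in as an ordinary argument rather than queried from a SwrContext.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Upper bound on output samples per channel:
// ceil((buffered + input) * outRate / inRate)
inline std::int64_t estimateOutSamples(std::int64_t inSamples, std::int64_t bufferedSamples,
                                       int inRate, int outRate) {
    assert(inSamples >= 0 && bufferedSamples >= 0 && inRate > 0 && outRate > 0);
    const std::int64_t total = inSamples + bufferedSamples;
    return (total * outRate + inRate - 1) / inRate;  // integer division rounded up
}

// Bytes needed for a packed output buffer: samples * channels * bytes per sample.
inline std::size_t outBufferBytes(std::int64_t outSamples, int channels, int bytesPerSample) {
    assert(outSamples >= 0 && channels > 0 && bytesPerSample > 0);
    return static_cast<std::size_t>(outSamples) * channels * bytesPerSample;
}
```

For a 1024-sample frame resampled from 44100 Hz to 16000 Hz with nothing buffered, the estimate is 372 samples per channel, which for 16-bit stereo output means a 1488-byte buffer.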

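2. Main code

Because I wrapped this into two classes (EasyAudioContext and EasyAudioDecoder), the code is fairly long. See the link at the top of the article for the complete code; it is posted here mainly so that I can read it online later.

(This demo is fairly simple; it mostly just calls the relevant FFmpeg interfaces.)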

#pragma once
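//NOTE: the header names were lost from the source text; the includes below are
//the ones the code shown here needs (an assumption, not the original list)
#include <QString>
#include <QtGlobal>

//Importing these in the header is just a shortcut
extern "C" {
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libswresample/swresample.h>
#include <libavutil/avutil.h>
#include <libavutil/samplefmt.h>
}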

/**
 * @brief Stores the audio format and related information
 */
struct EasyAudioInfo
{
    bool valid=false;

    //Information about the audio file
    QString filepath;
    QString filename;
    QString encode;
    int sampleRate = 0;
    int channels = 0;
    int sampleBit = 0;
    qint64 duration = 0;
    qint64 bitRate = 0;
    qint64 size = 0;
    QString type;
};

/**
 * @brief Stores the parameters used for input or output
 */
struct EasyAudioParameter
{
    //Channel count
    int channels=1;
    //Sample storage format, see enum AVSampleFormat; AV_SAMPLE_FMT_S16=1
    AVSampleFormat sampleFormat=AV_SAMPLE_FMT_S16;
    //Sample rate
    int sampleRate=16000;
};

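/**
 * @brief Manages the audio context; can also be used to query the audio format and related information
 * @author 龚建波
 * @date 2020-11-20
 * @details
 * Copy and assignment are disabled; manage the object with a smart pointer when it has to be passed as an argument
 * (NULL is used instead of nullptr to stay consistent with C)
 *
 * Memory management references:
 * https://www.jianshu.com/p/9f45d283d904
 * https://blog.csdn.net/leixiaohua1020/article/details/41181155
 *
 * Audio information references:
 * https://blog.csdn.net/zhoubotong2012/article/details/79340722
 * https://blog.csdn.net/luotuo44/article/details/54981809
 *
 */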
class EasyAudioContext
{
private:
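    //Parsing state; only Success indicates success
    enum EasyState{
        None //invalid
        ,Success //parsed successfully
        ,NoFile //file does not exist
        ,FormatOpenError //failed to open the file
        ,FindStreamError //failed to read stream info
        ,NoAudio //no audio stream found
        ,CodecFindDecoderError //decoder not found
        ,CodecAllocContextError //failed to allocate the decoder context
        ,ParameterError //failed to fill the decoder context
        ,CodecOpenError //failed to open the decoder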
    };
public:
    //Create the context from a file
    explicit EasyAudioContext(const QString &filepath);
    //Copy and assignment are disabled; manage with a smart pointer
    EasyAudioContext(const EasyAudioContext &other)=delete;
    EasyAudioContext &operator =(const EasyAudioContext &other)=delete;
    EasyAudioContext(EasyAudioContext &&other)=delete;
    EasyAudioContext &operator =(EasyAudioContext &&other)=delete;
    //Resources are released in the destructor
    ~EasyAudioContext();

    //Whether the context is valid
    bool isValid() const;
    //Get the audio format and related information of this context
    EasyAudioInfo getAudioInfo() const;
    //Get the parameters of this context
    EasyAudioParameter getAudioParameter() const;

private:
    //Initialize the context from a file
    void init(const QString &filepath);
    //Release resources
    void free();

private:
    //Source file path
    QString srcpath;
    //Whether this context is valid; invalid by default
    EasyState status=None;

    //Format I/O context
    AVFormatContext *formatCtx = NULL;
    //Decoder
    AVCodec *codec = NULL;
    //Decoder context
    AVCodecContext *codecCtx = NULL;
    //Codec parameters
    AVCodecParameters *codecParam = NULL;
    //Audio stream index
    int streamIndex = -1;

    //The friend class accesses the private members
    friend class EasyAudioDecoder;
};

#include "EasyAudioContext.h"

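//NOTE: most header names were lost from the source text; apart from
//swresample.h, the includes below are assumptions based on the code shown here
#include <QFileInfo>
#include <QDebug>

extern "C" {
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libavutil/avutil.h>
#include "libswresample/swresample.h"
#include <libavutil/samplefmt.h>
}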

EasyAudioContext::EasyAudioContext(const QString &filepath)
{
    init(filepath);
}

EasyAudioContext::~EasyAudioContext()
{
    free();
}

bool EasyAudioContext::isValid() const
{
    return (status==EasyState::Success);
}

EasyAudioInfo EasyAudioContext::getAudioInfo() const
{
    EasyAudioInfo info;

    //Copy over the format information we need
    info.valid = isValid();
    info.filepath = srcpath;
    QFileInfo f_info(srcpath);
    info.filename = f_info.fileName();
    info.size = f_info.size();

    if(!isValid())
        return info;

    info.encode = codec->name;
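    info.sampleRate = codecParam->sample_rate; //Hz
    info.channels = codecParam->channels;
    //2022-08-25: the precision read before was not the file's actual precision, so 24-bit etc. were misidentified
    //NOTE: part of this statement was lost in the source text; the lines below are a
    //reconstruction from the surviving fragments, not necessarily the original code
    info.sampleBit = av_get_bytes_per_sample(codecCtx->sample_fmt) * 8; //width of the decoded sample format
    if (codecCtx->bits_per_raw_sample > 0) {
        info.sampleBit = codecCtx->bits_per_raw_sample; //actual precision, e.g. 24
    }
    info.duration = formatCtx->duration/(AV_TIME_BASE/1000.0);  //ms
    //2020-12-31: an ape file reported an audio bit rate of 0, so fall back to the container bit rate when it is invalid
    //(the exact comparison was lost in the source text; <= 0 is assumed here)
    info.bitRate = codecCtx->bit_rate <= 0 ? formatCtx->bit_rate : codecCtx->bit_rate; //bps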
    info.type = formatCtx->iformat->name;

    return info;
}

EasyAudioParameter EasyAudioContext::getAudioParameter() const
{
    EasyAudioParameter param;

    if(!isValid())
        return param;
    param.channels=codecCtx->channels;
    param.sampleFormat=codecCtx->sample_fmt;
    param.sampleRate=codecCtx->sample_rate;

    return param;
}

void EasyAudioContext::init(const QString &filepath)
{
    srcpath=filepath;
    if(!QFileInfo::exists(filepath)){
        status=EasyState::NoFile;
        return;
    }

    //FFmpeg expects UTF-8 paths by default, so convert here
    QByteArray temp=filepath.toUtf8();
    const char *path=temp.constData();
    //const char *filepath="D:/Download/12.wav";

    //Open the input stream and read the header
    //The stream must be closed with avformat_close_input
    //Returns 0 on success
    const int result=avformat_open_input(&formatCtx, path, NULL, NULL);
    if (result!=0||formatCtx==NULL){
        status=EasyState::FormatOpenError;
        return;
    }

    //Read the file to obtain stream info and store it in the AVFormatContext
    //Returns >=0 on success
    if (avformat_find_stream_info(formatCtx, NULL) < 0) {
        status=EasyState::FindStreamError;
        return;
    }

    //qDebug()
