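Boost Library - Feature Overview - Common String Operations - Encoding Conversion - Formatting to Strings - Splitting and Joining Strings - Type Conversion - Replacing, Deleting and Trimming Characters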

Plugin Development. Published: 2022-05-19 08:42:59

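Contents

- 1. Encoding conversion
- 2. Formatting to strings
- 3. Splitting and joining strings
- 4. Type conversion
- 5. Replacing, deleting and trimming characters
- 6. Author Q&A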

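Standard C++ offers only basic string handling and has no direct wrappers for a number of common operations. This article covers the string-handling facilities in the Boost library: encoding conversion, string formatting, splitting and joining, type conversion, and replacing, deleting and trimming characters.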

1. Encoding conversion

To represent different scripts (for example English and Chinese) and different characters within one script (for example 中 and 国), text has to be encoded according to agreed rules, so that every device uses the same number for the same character and can decode that number back into text. The encoding names that may come up are listed below; in code they must be specified by name.

- European languages: ASCII, ISO-8859-{1,2,3,4,5,7,9,10,13,14,15,16}, KOI8-R, KOI8-U, KOI8-RU, CP{1250,1251,1252,1253,1254,1257}, CP{850,866,1131}, Mac{Roman,CentralEurope,Iceland,Croatian,Romania}, Mac{Cyrillic,Ukraine,Greek,Turkish}, Macintosh
- Semitic languages: ISO-8859-{6,8}, CP{1255,1256}, CP862, Mac{Hebrew,Arabic}
- Japanese: EUC-JP, SHIFT_JIS, CP932, ISO-2022-JP, ISO-2022-JP-2, ISO-2022-JP-1, ISO-2022-JP-MS
- Chinese: EUC-CN, HZ, GBK, CP936, GB18030, EUC-TW, BIG5, CP950, BIG5-HKSCS, BIG5-HKSCS:2004, BIG5-HKSCS:2001, BIG5-HKSCS:1999, ISO-2022-CN, ISO-2022-CN-EXT
- Korean: EUC-KR, CP949, ISO-2022-KR, JOHAB
- Armenian: ARMSCII-8
- Georgian: Georgian-Academy, Georgian-PS
- Tajik: KOI8-T
- Kazakh: PT154, RK1048
- Thai: ISO-8859-11, TIS-620, CP874, MacThai
- Laotian: MuleLao-1, CP1133
- Vietnamese: VISCII, TCVN, CP1258
- Platform specifics: HP-ROMAN8, NEXTSTEP
- Full Unicode: UTF-8; UCS-2, UCS-2BE, UCS-2LE; UCS-4, UCS-4BE, UCS-4LE; UTF-16, UTF-16BE, UTF-16LE; UTF-32, UTF-32BE, UTF-32LE; UTF-7; C99, JAVA
- Full Unicode, in terms of uint16_t or uint32_t (with machine-dependent endianness and alignment): UCS-2-INTERNAL, UCS-4-INTERNAL
- Locale dependent, in terms of char or wchar_t (with machine-dependent endianness and alignment, and with OS- and locale-dependent semantics): char, wchar_t

The four most commonly used are UTF-8, GB2312, GBK and UTF-16LE. Sample conversion functions are shown below:

#include <boost/locale.hpp>
#include <iostream>
#include <string>

std::string UTF8toGBK(const std::string & str)
{
    return boost::locale::conv::between(str, "GBK", "UTF-8");
}

std::string GBKtoUTF8(const std::string & str)
{
    return boost::locale::conv::between(str, "UTF-8", "GBK");
}

std::wstring GBKtoUNICODE(const std::string & str)
{
    // to_utf cannot deduce the output character type, so it is given explicitly
    return boost::locale::conv::to_utf<wchar_t>(str, "GBK");
}

std::string UNICODEtoGBK(std::wstring wstr)
{
    return boost::locale::conv::from_utf(wstr, "GBK");
}

std::string UNICODEtoUTF8(const std::wstring& wstr)
{
    return boost::locale::conv::from_utf(wstr, "UTF-8");
}

std::wstring UTF8toUNICODE(const std::string & str)
{
    // utf_to_utf also needs the output character type as a template argument
    return boost::locale::conv::utf_to_utf<wchar_t>(str);
}

int main()
{
	std::string source = "18928899728-广州知了软件有限公司";
	//std::string between(std::string const &text, std::string const &to_encoding, std::string const &from_encoding, method_type how = default_method);
	std::string s = boost::locale::conv::between(source, "UTF-8", "GB2312");// target encoding name first, then source encoding name
	return 0;
}

2. Formatting to strings

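boost::format( "format-string ") % arg1 % arg2 % … % argN ; Note that no named format object is declared here. format-string is the string to be formatted, and the arguments follow it through the overloaded % operator.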

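//In format-string, %N% is a placeholder: %1% is the first, %2% the second, and so on. Each %"xxx" that follows supplies the value for the corresponding placeholder, for example:
boost::format fmt("%1% \n %2% \n %3%");
fmt % "first" % "second" % "third";
std::string st=fmt.str();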

The following is a simple number-formatting example:

#include <iostream>
#include <boost/format.hpp>
int main()
{
	// Convert a floating-point value to a string, and keep integers at 6 digits
	std::cout
