Summary of Using the Multi-Byte and Unicode Character Sets in Visual Studio

AfxMessageBox("编码类型不正确!");

In a Unicode build this must be changed to:

AfxMessageBox(_T("编码类型不正确!"));

strTest.Replace(" ", "");

must likewise be changed to:

strTest.Replace(_T(" "), _T(""));

WORD cbBufMax = 2000;
WORD cbBufOut;
// The buffer element type must match the character set:
// wchar_t under Unicode, char under multi-byte.
#ifdef UNICODE
wchar_t szBuf[2001];
wchar_t *pszBuf = szBuf;
// Get the names of the installed drivers ("odbcinst.h" has to be included)
if (!SQLGetInstalledDrivers(szBuf, cbBufMax, &cbBufOut))
{
    m_sExcelDriver = _T("");
}
#else
char szBuf[2001];
char *pszBuf = szBuf;
// Get the names of the installed drivers ("odbcinst.h" has to be included)
if (!SQLGetInstalledDrivers(szBuf, cbBufMax, &cbBufOut))
{
    m_sExcelDriver = "";
}
#endif

Some library functions that operate on characters and strings also differ between the two character sets:

#ifdef UNICODE
// Search for the driver...
do
{
    if (wcsstr(pszBuf, _T("Excel")) != 0)
    {
        // Found!
        m_sExcelDriver = CString(pszBuf);
        break;
    }
    pszBuf = wcschr(pszBuf, L'\0') + 1;
} while (pszBuf[1] != L'\0');
#else
// Search for the driver...
do
{
    if (strstr(pszBuf, "Excel") != 0)
    {
        // Found!
        m_sExcelDriver = CString(pszBuf);
        break;
    }
    pszBuf = strchr(pszBuf, '\0') + 1;
} while (pszBuf[1] != '\0');
#endif
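The search walks a list in which every driver name ends with '\0' and one extra '\0' follows the last name. As a minimal sketch of that traversal, here is a portable narrow-character version. `FindEntry` is a hypothetical helper, not part of the original code, and it stops at the first empty string rather than using the `pszBuf[1]` test above. A wide-character variant would substitute `wcslen` and `wcsstr`.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical helper (not from the article): returns the first entry of a
// double-null-terminated list that contains `needle`, or "" if none does.
std::string FindEntry(const char* list, const char* needle)
{
    assert(list != nullptr && needle != nullptr);
    // An empty string marks the end of the list.
    for (const char* p = list; *p != '\0'; p += std::strlen(p) + 1)
    {
        if (std::strstr(p, needle) != nullptr)
            return std::string(p);
    }
    return std::string();
}
```

Calling it with the filled buffer and "Excel" would yield the matching driver name, which could then be assigned to m_sExcelDriver.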

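For better compatibility, prefer functions that work in both builds. To convert a string to a long integer, for example, use _tcstol instead of the character-set-specific strtol or wcstol. Conversion functions such as _ttoi and _ttof follow the same pattern.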
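To make the idea concrete, the sketch below imitates how a generic name can resolve to either the narrow or the wide function at compile time. `DemoChar`, `DEMO_T`, `demo_tcstol` and `DEMO_UNICODE` are hypothetical stand-ins invented for this example so that it builds without tchar.h; in a real Visual Studio project you would use TCHAR, _T and _tcstol directly.

```cpp
#include <cassert>
#include <cstdlib>
#include <cwchar>

// Hypothetical stand-ins for TCHAR, _T and _tcstol so the sketch builds
// without <tchar.h>. Define DEMO_UNICODE to select the wide variants.
#ifdef DEMO_UNICODE
typedef wchar_t DemoChar;
#define DEMO_T(x) L##x
#define demo_tcstol std::wcstol
#else
typedef char DemoChar;
#define DEMO_T(x) x
#define demo_tcstol std::strtol
#endif

// The calling code is written once and compiles under either setting.
long ParseLong(const DemoChar* text)
{
    assert(text != nullptr);
    DemoChar* end = nullptr;
    return demo_tcstol(text, &end, 10);
}
```

A call such as ParseLong(DEMO_T("1234")) reads the same in both configurations, which is the point of the generic names.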

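When writing a file under the Unicode character set, be careful with lengths: the write length is counted in bytes, and each wide character occupies two bytes: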

CFile fileSave;
CString strGetData(_T("写入测试"));
CString strPath(_T("test.txt"));
if (!fileSave.Open(strPath, CFile::modeCreate | CFile::modeNoTruncate | CFile::modeWrite))
{
    return;
}
wchar_t wch = 0xFEFF; // byte order mark for UTF-16
fileSave.Write(&wch, sizeof(wchar_t));
// Write() takes a byte count, so multiply the character count by sizeof(wchar_t)
fileSave.Write(strGetData.LockBuffer(), (UINT)(wcslen(strGetData) * sizeof(wchar_t)));
strGetData.UnlockBuffer();
fileSave.Close();
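The length rule can be checked in isolation. The sketch below separates the character count from the byte count; `CharCount` and `ByteCount` are hypothetical names used only for this example. It multiplies by sizeof(wchar_t) rather than a literal 2 because wchar_t is two bytes with Visual C++ but four bytes on many other platforms.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Number of wide characters, not counting the terminating L'\0'.
std::size_t CharCount(const wchar_t* text)
{
    assert(text != nullptr);
    return std::wcslen(text);
}

// Number of bytes those characters occupy: the value a byte-oriented
// write call needs.
std::size_t ByteCount(const wchar_t* text)
{
    return CharCount(text) * sizeof(wchar_t);
}
```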

Another big issue with the Unicode character set is converting between CString and char. Below are the conversion functions I have put together. First, char* to CString:

CString Char2CString(char *pChar)
{
    int charLen = (int)strlen(pChar); // length of the source string in bytes
    int len = MultiByteToWideChar(CP_ACP, 0, pChar, charLen, NULL, 0); // number of wide characters required
    wchar_t *pWChar = new wchar_t[len + 1]; // allocate the wide-character buffer
    MultiByteToWideChar(CP_ACP, 0, pChar, charLen, pWChar, len); // convert multi-byte to wide characters
    pWChar[len] = L'\0';
    // Convert the wchar_t array to a CString
    CString str;
    str.Append(pWChar);
    delete[] pWChar;
    return str;
}

CString to char*:

char* CString2Char(CString str)
{
    DWORD dwCount = str.GetLength();
    int len = WideCharToMultiByte(CP_ACP, 0, str, dwCount, NULL, 0, NULL, NULL); // number of bytes required
    char *pChar = new char[len + 1];
    WideCharToMultiByte(CP_ACP, 0, str, dwCount, pChar, len, NULL, NULL); // convert wide characters to multi-byte
    pChar[len] = '\0'; // terminate with '\0'
    return pChar; // the caller must delete[] this buffer
}

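As the code shows, converting CString to char* here allocates memory with new, so the caller has to remember to delete[] it afterwards. If you are not in the habit of releasing memory, it is better not to allocate inside the conversion at all, since a forgotten delete[] means a memory leak. Here is an improved version of the conversion function: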

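static int CStrintToCharWithoutNew(CString str, char *buf)
{
    // With a source length of -1, the returned size already includes the terminating '\0'
    int len = WideCharToMultiByte(CP_ACP, 0, str, -1, NULL, 0, NULL, NULL);
    if (len > 0)
        WideCharToMultiByte(CP_ACP, 0, str, -1, buf, len, NULL, NULL); // writes the '\0' as well
    else
        buf[0] = '\0'; // conversion failed: return an empty string
    return len; // bytes written, including the terminating '\0'
}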

With this version there is nothing to delete afterwards, provided the destination array is large enough:

char recvData[200];
CString testdata = _T("测试转换");
int len = CStrintToCharWithoutNew(testdata, recvData);
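If neither a manual delete[] nor a fixed-size array is appealing, another option is to return an owning std::string. The sketch below is one way to do that, not the article's method: it swaps in the standard std::wcsrtombs for WideCharToMultiByte so that it builds anywhere, which means it converts according to the current C locale rather than CP_ACP. `WideToNarrow` is a hypothetical name.

```cpp
#include <cassert>
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical helper: wide string to multi-byte string in the current
// C locale. Returns "" if a character cannot be represented.
std::string WideToNarrow(const std::wstring& wide)
{
    std::mbstate_t state = std::mbstate_t();
    const wchar_t* src = wide.c_str();
    // First pass: with a null destination, wcsrtombs returns the byte count
    // needed, excluding the terminating '\0'.
    std::size_t needed = std::wcsrtombs(nullptr, &src, 0, &state);
    if (needed == static_cast<std::size_t>(-1))
        return std::string();

    // Second pass: convert into a buffer sized from the first pass.
    std::vector<char> buffer(needed + 1);
    state = std::mbstate_t();
    src = wide.c_str();
    std::size_t written = std::wcsrtombs(buffer.data(), &src, buffer.size(), &state);
    assert(written == needed);
    return std::string(buffer.data(), written);
}
```

The buffer is sized from the first pass, so it cannot be too small, and both the vector and the returned string release their memory automatically.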

The methods above are basic but somewhat cumbersome. To convert char* to CString, the A2T() or A2W() macro is all you need:

char cBuf[] = "hello 世界";
CString str = _T("");
USES_CONVERSION;
str = A2T(cBuf);
// str = A2W(cBuf);

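If the project does not already include the header, add #include <atlconv.h> yourself. The USES_CONVERSION macro must appear before the conversion macros are used; otherwise the compiler reports: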

error C2065: '_lpa': undeclared identifier

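Similarly, converting CString to char only needs the T2A or W2A macro. As for the advice found online to convert char to CString with the Format method, like this: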

char cBuf[] = "hello 世界";
CString str = _T("");
str.Format(_T("%s"), cBuf);

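In my tests this does not work in a Unicode build: there %s expects a wide string, and the cBuf array does not hold wide characters. Changing it as follows works: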

str.Format(_T("%s"), _T("hello 世界"));

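Sometimes a function parameter has the pointer type LPTSTR (char* under multi-byte, wchar_t* under Unicode). How do you convert a CString to LPTSTR? Either of the following two approaches works: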

CString str("hello 世界");
LPTSTR lp = (LPTSTR)(LPCTSTR)str;
// LPTSTR lp = str.GetBuffer();

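The code above works under both the multi-byte and Unicode character sets. The cast in the first approach removes const, so treat that pointer as read-only. When you use GetBuffer(), remember to call ReleaseBuffer() when you are done; this does not free memory, it hands the buffer back to the CString and resynchronizes its length. One detail attentive readers may have noticed: the CString above was constructed without the _T macro, rather than as: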

CString str(_T("hello 世界"));

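The CString constructor handles the conversion for us here, so there is no need to worry about the encoding. By now you are probably curious about the _T macro: what does it actually do? Its definition is in tchar.h; the relevant excerpt is: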

#define _T(x)       __T(x)
#define _TEXT(x)    __T(x)
#define __T(x)      L ## x  // Unicode build
#define __T(x)      x       // multi-byte build

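It converts the literal according to the build's character set. Under Unicode it prefixes the literal with L to mark it as wide, so the following definition only compiles under Unicode: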

wchar_t wBuf[] = _T("hello 世界");

which is equivalent to:

wchar_t wBuf[] = L"hello 世界";

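L is the prefix that marks a wide-character literal (it is not itself a macro); in the debugger, the values of wide-character variables are displayed with the L. Under the multi-byte character set the definition fails to compile, because it expands to: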

wchar_t wBuf[] = "hello 世界";
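The effect of the two expansions can be verified at compile time. In the sketch below, `DEMO_T_WIDE` and `DEMO_T_NARROW` are hypothetical macros that mimic the Unicode and multi-byte definitions of __T side by side, so both can be inspected in a single build without tchar.h.

```cpp
#include <cassert>
#include <cstring>
#include <cwchar>
#include <type_traits>

// Hypothetical macros mirroring the two definitions of __T.
#define DEMO_T_WIDE(x)   L##x   // token pasting produces L"..."
#define DEMO_T_NARROW(x) x      // literal is left unchanged

// The wide form is an array of wchar_t, the narrow form an array of char.
static_assert(std::is_same<decltype(DEMO_T_WIDE("hi")), const wchar_t (&)[3]>::value,
              "wide form yields a wchar_t array");
static_assert(std::is_same<decltype(DEMO_T_NARROW("hi")), const char (&)[3]>::value,
              "narrow form yields a char array");

void CheckLiteralLengths()
{
    const wchar_t wideBuf[] = DEMO_T_WIDE("hello");
    const char narrowBuf[] = DEMO_T_NARROW("hello");
    assert(std::wcslen(wideBuf) == 5);
    assert(std::strlen(narrowBuf) == 5);
}
```

Initializing a wchar_t array from the narrow form, as in the failing line above, is rejected by the compiler because the element types do not match.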

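You will sometimes come across WCHAR and TCHAR. Don't panic: they are typedefs rather than macros. WCHAR is simply another name for wchar_t, and many similar definitions can be found in WinNT.h. TCHAR deserves a note of its own: it is a type defined by the Windows headers (not by MFC itself) to unify character-set handling, and like the _T macro it has a different definition under each character set: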

typedef wchar_t TCHAR; // Unicode build
typedef char TCHAR;    // multi-byte build

That gives a solution to the problem above. Define the array like this:

TCHAR tBuf[] = _T("hello 世界");

typedef LPCWSTR PCTSTR, LPCTSTR;
typedef __nullterminated CONST WCHAR *LPCWSTR, *PCWSTR;

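Analysis shows that under Unicode LPCTSTR is really a const WCHAR*, whereas that library, having originally been compiled under the multi-byte character set, should have resolved it to LPCSTR (really a const char*):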

typedef __nullterminated CONST CHAR *LPCSTR, *PCSTR;

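So for an external interface, declaring parameters with the build-dependent generic character types is a bad idea: whoever compiles against it may not choose the same character set as you, which easily leads to link errors. Declaring the parameters as plain char* is the right way to go.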
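One way to follow this advice is to keep char in the exported signature and convert to wide characters only inside the implementation. The sketch below illustrates that split. `CountCharacters` and `NarrowToWide` are hypothetical names, and the standard std::mbsrtowcs (which follows the current C locale) stands in for MultiByteToWideChar so that the example builds on any platform.

```cpp
#include <cassert>
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical exported function: the signature uses only char, so its
// type does not change with the caller's character-set setting.
std::size_t CountCharacters(const char* text);

// Internal helper: multi-byte to wide in the current C locale.
static std::wstring NarrowToWide(const char* text)
{
    std::mbstate_t state = std::mbstate_t();
    const char* src = text;
    // With a null destination, mbsrtowcs returns the wide-character count
    // needed, excluding the terminating L'\0'.
    std::size_t needed = std::mbsrtowcs(nullptr, &src, 0, &state);
    if (needed == static_cast<std::size_t>(-1))
        return std::wstring(); // invalid multi-byte sequence

    std::vector<wchar_t> buffer(needed + 1);
    state = std::mbstate_t();
    src = text;
    std::size_t written = std::mbsrtowcs(buffer.data(), &src, buffer.size(), &state);
    assert(written == needed);
    return std::wstring(buffer.data(), written);
}

std::size_t CountCharacters(const char* text)
{
    assert(text != nullptr);
    return NarrowToWide(text).size(); // wide characters are used only internally
}
```

Because the declaration never mentions TCHAR or LPCTSTR, a caller built with either character set sees the same function.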

Published by 全栈程序员-站长. Please credit the source when reposting: https://javaforall.net/231373.html Original link: https://javaforall.net
