通过Oracle识别字符串中的中文or字母or数字来介绍全角半角转换函数(to_multi_byte/to_single_byte)在varchar/clob中的使用案例
前言
在日常处理数据的过程中,大家肯定会遇到很多奇奇怪怪的字符,然后还要对这些字符处理,比如你有个需求:识别字符串中的中文或是识别字母或是识别数字,甚至都识别出来然后剔除or保留某些字符汉字或数字。
你去百度了一下相关问题,然后得到的结果大都是用正则 ‘\4E00’ and '\9FA5’来识别中文范围用a-zA-z或0-9或[:digit:][:alpha:]来识别字母或数字。但是如果你的字符串中包含全角字符,那这样是识别不全的!!!那怎么做才能够正确的识别中文、字母、数字呢???那就要考虑先做全半角的转换了,Oracle中用全角半角转换函数(to_multi_byte/to_single_byte)来实现,但是对于varchar/clob处理方式还不一样,下面是我实现识别字符串中的中文或是识别字母或是识别数字需求的方式。
一、全角半角转换函数(to_multi_byte/to_single_byte)官方介绍
开始之前先上两个函数在官方得详细说明,因为之前开发函数之前没去官方看说明,直接百度看说明导致得坑,下面是官方文档得截图,官方明确说明这两个函数不支持clob字段处理,但是我们可以通过隐式转换来实现,后面给出了本人写的对正常情况下varchar类型的支持以及相同功能下clob字段的实现方法:
Purpose
TO_SINGLE_BYTE returns char with all of its multibyte characters
converted to their corresponding single-byte characters. char can be
of data type CHAR, VARCHAR2, NCHAR, or NVARCHAR2. The value returned
is in the same data type as char.Any multibyte characters in char that have no single-byte equivalents
appear in the output as multibyte characters. This function is useful
only if your database character set contains both single-byte and
multibyte characters.This function does not support CLOB data directly. However, CLOBs can
be passed in as arguments through implicit data conversion.See Also:
“Data Type Comparison Rules” for more information.
Appendix C in Oracle Database Globalization Support Guide for the
collation derivation rules, which define the collation assigned to the
character return value of TO_SINGLE_BYTE
PurposeTO_MULTI_BYTE returns char with all of its single-byte characters
converted to their corresponding multibyte characters. char can be of
data type CHAR, VARCHAR2, NCHAR, or NVARCHAR2. The value returned is
in the same data type as char.Any single-byte characters in char that have no multibyte equivalents
appear in the output string as single-byte characters. This function
is useful only if your database character set contains both
single-byte and multibyte characters.This function does not support CLOB data directly. However, CLOBs can
be passed in as arguments through implicit data conversion.See Also:
“Data Type Comparison Rules” for more information.
Appendix C in Oracle Database Globalization Support Guide for the
collation derivation rules, which define the collation assigned to the
character return value of TO_MULTI_BYTE
二、识别字符串中的中文、英文、数字
测试数据如下:
------测试数据:
create table test_tab(id number,name varchar2(1000),loc clob);
insert into test_tab values(1,'','');
insert into test_tab values(2,' ,.·',' .,·');
insert into test_tab values(3,' '||chr(10)||chr(13),' '||chr(10)||chr(13));
insert into test_tab values(4,'sdf 东方闪电sdf123123 .·','sdf 东方闪电sdf123123 .·');
insert into test_tab values(5,'123','123');
insert into test_tab values(6,' 1 2 3',' 1 2 3');
insert into test_tab values(7,'1 2 3','1 2 3');
insert into test_tab values(8,'1 2 3ADSDSasdsASDAS','1 2 3ASDasdsASDAS');
insert into test_tab values(10,' 1 2 3ADSDSasdsASDAS',' 1 2 3ASDasdsASDAS');
insert into test_tab values(11,'sdf','sdf');
commit;
处理varchar类型数据
create or replace function trans_varchar(p_char varchar2) return varchar2 DETERMINISTIC is
v_cache varchar2(4000);
v_char varchar2(4000):=to_single_byte(p_char);
cursor col is
select regexp_substr(v_char, '[^[:digit:],^[:punct:],^[:cntrl:],^[:blank:]]', 1, level) as ch,
regexp_substr(v_char, '[[:digit:]]', 1, level) as nu
from dual
connect by level <= case when length(p_char)>15 then 15 else length(p_char) end;
begin
v_cache := '';
for i in col loop
case
when asciistr(i.ch) between '\4E00' and '\9FA5' then
v_cache := v_cache || i.ch;
/* when i.ch between 0 and 9 then
v_cache:=v_cache||i.ch;*/
when i.ch between 'a' and 'z' then
v_cache := v_cache || i.ch;
when i.ch between 'A' and 'Z' then
v_cache := v_cache || i.ch;
else
v_cache := v_cache || i.nu;
end case;
end loop;
if v_cache is null then
return null;
else
return rtrim(ltrim(rtrim(ltrim(rtrim(ltrim(trim(v_char),
chr(9)),
chr(9)),
chr(10)),
chr(10)),
chr(13)),
chr(13));
end if;
end trans_varchar;
/
处理clob类型数据
create or replace function trans_clob(p_char clob) return clob DETERMINISTIC is
v_cache varchar2(4000);
v_char clob:=to_single_byte(dbms_lob.substr(p_char));
cursor col is
select regexp_substr(v_char, '[^[:digit:],^[:punct:],^[:cntrl:],^[:blank:]]', 1, level) as ch,
regexp_substr(v_char, '[[:digit:]]', 1, level) as nu
from dual
connect by level <= 15;
begin
v_cache := '';
for i in col loop
case
when asciistr(i.ch) between '\4E00' and '\9FA5' then
v_cache := v_cache || i.ch;
/* when i.ch between 0 and 9 then
v_cache:=v_cache||i.ch;*/
when i.ch between 'a' and 'z' then
v_cache := v_cache || i.ch;
when i.ch between 'A' and 'Z' then
v_cache := v_cache || i.ch;
else
v_cache := v_cache || i.nu;
end case;
end loop;
if v_cache is null then
return null;
else
return rtrim(ltrim(rtrim(ltrim(rtrim(ltrim(trim(v_char),
chr(9)),
chr(9)),
chr(10)),
chr(10)),
chr(13)),
chr(13));
end if;
end trans_clob;
/
相关文章
- 探索Oracle在信息管理中的应用(oracle的应用)
- 利用Oracle技术实现文本导入表(oracle文本导入表)
- 示例使用Oracle序列实现自增主键:案例示范(oracle序列使用)
- 函数精通 Oracle 自定义函数的技巧(oracle自定义)
- Oracle 列自增功能指南(oracle列自增)
- Oracle数据库中触发器的类型及其应用(oracle触发器类型)
- 深入浅出Oracle字符串中文处理(oracle字符串中文)
- 『成都引领:Oracle聘请新人』(成都oracle招聘)
- 如何在Oracle中查看绑定变量(oracle查看绑定变量)
- 轻松获取信息——如何使用Oracle查看数据字典。(oracle查看数据字典)
- 如何使用Oracle创建表(oracle怎样创建表)
- 使用Oracle回滚语句来增强可靠性(oracle回滚语句)
- Oracle中文判断:最佳实践(oracle判断中文)
- 【Oracle查询:妙用中文条件】(oracle中文条件)
- Oracle公司中文名字为甲骨文(oracle公司中文名字)
- C结合Oracle数据库实现案例(c# oracle 例子)
- 解决Oracle中文乱码问题(c oracle中文乱码)
- 级数据Oracle中精确到毫秒的数据获取方法(oracle中获取毫秒)
- Oracle用户人数榜单谁在领跑(oracle人数列名)
- Oracle 实现实时应用的示范(oracle 使用案例)
- Oracle中日期数据类型的比较分析(oracle中日期的比较)
- Oracle中文音排序 突破挑战把握发展单位(oracle 中文音排序)
- Oracle中文拼音转换之旅(oracle中文转成拼音)
- Oracle中文读音从拼音到完美发音(oracle中文读法)
- 学习Oracle挑战中文试题(oracle中文试题)
- 探索Oracle中文数据的未知之谜(oracle中文数据问号)
- 深入探究Oracle中文报错乱码问题(oracle中文报错乱码)
- 解决Oracle中文向导乱码问题(oracle中文向导乱码)
- 解决Oracle中文乱码问题(oracle中文出不来)
- Oracle中计算周数的方法(oracle中如何算周数)
- 解决 Oracle 无法识别中文问题(oracle 不识别中文)
- Oracle却依然不购买旗舰数据库产品(oracle不买数据库)
- 学习 Oracle SQL实战案例分享(oracle sql案例)
- Oracle Quto开启一场智慧之旅(oracle quto)
- 固定 Job 间隔调度Oracle 案例实践(oracle job间隔)