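I want to split the record below on commas, but some of the quoted strings contain commas themselves, and I don't want those inner commas treated as separators.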
6898,"RAAF Williams, Laverton Base","Laverton","Australia",\N,"YLVT",-37.86360168457031,144.74600219726562,18,10,"O","Australia/Hobart","airport","OurAirports"
I'd like it split like this:
6898
RAAF Williams, Laverton Base
Laverton
Australia
Calling Python's split method directly breaks "RAAF Williams" and "Laverton Base" apart. Is there a way to avoid that?
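To make the problem concrete, here is a small sketch of what a plain split does to a shortened version of the record. The variable names are only illustrative.

```python
# Shortened version of the record; a raw string keeps \N as two literal characters.
record = r'6898,"RAAF Williams, Laverton Base","Laverton","Australia",\N,"YLVT"'

pieces = record.split(',')

# str.split knows nothing about quotes, so the comma inside the quoted
# name is treated as a separator like any other.
for piece in pieces:
    print(piece)
```

The quoted name comes back as two separate pieces, each still carrying one of its quote characters.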
1
qwjhb 2020-02-11 23:15:20 +08:00
exec('a=[6898,"RAAF Williams, Laverton Base","Laverton","Australia","YLVT",-37.86360168457031,144.74600219726562,18,10,"O","Australia/Hobart","airport","OurAirports"]')
|
2
retanoj 2020-02-11 23:23:41 +08:00 via iPhone
Nice trick. This is exactly how command-execution and code-execution vulnerabilities get written.
|
4
wuwukai007 2020-02-11 23:27:34 +08:00 via Android 1
Use a regular expression, provided the text fields are quoted.
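One way this could look, as a rough sketch rather than a full CSV parser: split only on commas that are followed by an even number of double quotes, which means the comma sits outside any quoted field. It assumes the fields never contain embedded double quotes; the name COMMA_OUTSIDE_QUOTES is made up for this example.

```python
import re

record = r'6898,"RAAF Williams, Laverton Base","Laverton","Australia",\N,"YLVT",-37.86360168457031,144.74600219726562,18,10,"O","Australia/Hobart","airport","OurAirports"'

# Split on a comma only when an even number of double quotes follows it,
# i.e. when the comma is not inside a quoted field.
COMMA_OUTSIDE_QUOTES = re.compile(r',(?=(?:[^"]*"[^"]*")*[^"]*$)')

# Remove the surrounding quotes from each field after splitting.
fields = [field.strip('"') for field in COMMA_OUTSIDE_QUOTES.split(record)]
print(fields[:4])
```

The lookahead rescans the rest of the line for every comma, so this is fine for single records but probably not the best choice for large files.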
|
5
huyinjie OP @wuwukai007 #4 Looks like the only option is to split it with something like ^(\d+),(.+),(.+),(.+),(.+),(.+),(.+),(.+),(.+),(.+),(.+),(.+),(.+)$
|
6
retanoj 2020-02-11 23:39:20 +08:00
|
7
retanoj 2020-02-11 23:41:26 +08:00
Sorry, the paste got garbled.
Try this:
>>> import csv
>>> list(csv.reader([your_string]))
|
8
yuanhego 2020-02-11 23:52:02 +08:00
|
9
noreply69 2020-02-11 23:56:00 +08:00
import csv

s = '6898,"RAAF Williams, Laverton Base","Laverton","Australia",\\N,"YLVT",-37.86360168457031,144.74600219726562,18,10,"O","Australia/Hobart","airport","OurAirports"'
splitted = list(csv.reader([s], delimiter=',', quotechar='"'))[0]
print(splitted)
|
10
noreply69 2020-02-11 23:56:32 +08:00
```
import csv

s = '6898,"RAAF Williams, Laverton Base","Laverton","Australia",\\N,"YLVT",-37.86360168457031,144.74600219726562,18,10,"O","Australia/Hobart","airport","OurAirports"'
splitted = list(csv.reader([s], delimiter=',', quotechar='"'))[0]
print(splitted)
```
|
12
Akkuman 2020-02-12 00:13:53 +08:00 via Android 1
ast.literal_eval('[6898,"RAAF Williams, Laverton Base","Laverton","Australia","YLVT",-37.86360168457031,144.74600219726562,18,10,"O","Australia/Hobart","airport","OurAirports"]')
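The call above works because the \N field was dropped from the text. A possible variation, sketched under the assumption that a bare \N between two commas stands for a missing value, is to rewrite it as None first. Unlike csv.reader, this returns typed values (int, float, str). It also interprets backslashes inside quoted fields as Python escapes, so it only suits data that happens to be valid Python literals.

```python
import ast
import re

record = r'6898,"RAAF Williams, Laverton Base","Laverton","Australia",\N,"YLVT",-37.86360168457031,144.74600219726562,18,10,"O","Australia/Hobart","airport","OurAirports"'

# Assumption: a bare \N between two commas marks a missing value.
# Rewrite it as the Python literal None so the text becomes a valid list.
literal_text = re.sub(r'(?<=,)\\N(?=,)', 'None', record)

row = ast.literal_eval('[' + literal_text + ']')
print(row[4], type(row[0]).__name__, type(row[6]).__name__)
```

A \N at the very start or end of the record would not be caught by this substitution and would need separate handling.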
|
14
huyinjie OP |
17
levelworm 2020-02-12 08:43:21 +08:00 1
Can the \N in the middle be removed? If it stays, it seems to raise an error?
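One likely cause, assuming the record was pasted into source code as an ordinary string literal: Python reads \N as the start of a named Unicode escape, so a lone \N is rejected at compile time. The sketch below shows two spellings that avoid this and checks that csv.reader then leaves the field alone. The bad_source snippet is a made-up example.

```python
import csv

# Two equivalent spellings of the same text: a raw string, or a doubled backslash.
raw = r'6898,"RAAF Williams, Laverton Base",\N,"YLVT"'
escaped = '6898,"RAAF Williams, Laverton Base",\\N,"YLVT"'
print(raw == escaped)

# A single backslash in a normal literal is rejected when the source is compiled,
# because \N must be followed by a character name in braces, e.g. "\N{BULLET}".
bad_source = r"s = 'a,\N,b'"
try:
    compile(bad_source, '<demo>', 'exec')
    compile_error = None
except SyntaxError as exc:
    compile_error = exc
print(type(compile_error).__name__)

# Once the text is a valid str, csv.reader passes \N through unchanged.
row = next(csv.reader([raw]))
print(row)
```

If the record is read from a file instead of being typed into source code, the escape problem does not come up at all.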
|
18
noqwerty 2020-02-12 10:17:17 +08:00 via Android 1
You can also just read the entire csv file in directly.
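A minimal sketch of that idea, assuming the records live in a UTF-8 text file with one record per line. A temporary file stands in for the real data here so that the example runs on its own.

```python
import csv
import os
import tempfile

# Write a one-line sample file so the sketch is self-contained;
# in practice the path would point at the real data file.
sample = r'6898,"RAAF Williams, Laverton Base","Laverton","Australia",\N,"YLVT"' + '\n'
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False,
                                 newline='', encoding='utf-8') as tmp:
    tmp.write(sample)
    path = tmp.name

# newline='' is what the csv module documentation recommends for file objects.
with open(path, newline='', encoding='utf-8') as f:
    rows = list(csv.reader(f))

os.remove(path)
print(rows[0][:4])
```

Each row comes back as a list of strings, with the quoted comma kept inside its field.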
|
19
smallpython 2020-02-12 10:25:05 +08:00 1
s = your_str
shuangyinhao_count = 0  # double quotes seen so far in the current field
result = []
temp = ''
for i in s:
    if i == '"':
        shuangyinhao_count += 1
    elif i == ',':
        if shuangyinhao_count == 1:
            # One quote seen means we are inside a quoted field: keep the comma
            temp += i
        else:
            result.append(temp)
            temp = ''
    else:
        temp += i
    if shuangyinhao_count == 2:
        # The closing quote was seen: reset for the next field
        shuangyinhao_count = 0
result.append(temp)
print(result)
|
20
araraloren 2020-02-12 11:05:17 +08:00 1
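Split with a regex:
import re

s = '6898,"RAAF Williams, Laverton Base","Laverton","Australia",\\N,"YLVT",-37.86360168457031,144.74600219726562,18,10,"O","Australia/Hobart","airport","OurAirports"'
# Each alternative matches one field without its trailing comma, so the last
# field (which has no comma after it) is matched too; quoted fields keep their quotes.
pattern = re.compile(r'"[^"]+"|[^",]+')
print(pattern.findall(s))
|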
21
chenstack 2020-02-12 12:53:31 +08:00 1
@qwjhb @retanoj To parse a string into Python objects safely, you can use ast.literal_eval; it raises an exception if it meets anything other than a literal, such as an arithmetic expression or a function call.
>>> import ast
>>> ast.literal_eval('6898,"RAAF Williams, Laverton Base","Laverton","Australia","N","YLVT",-37.86360168457031,144.74600219726562,18,10,"O","Australia/Hobart","airport","OurAirports"')
(6898, 'RAAF Williams, Laverton Base', 'Laverton', 'Australia', 'N', 'YLVT', -37.86360168457031, 144.74600219726562, 18, 10, 'O', 'Australia/Hobart', 'airport', 'OurAirports')
|
23
huyinjie OP @smallpython #19 Thanks. This is like getchar in C. It was the first approach I thought of, it just felt like more hassle.
|
27
larsenlouis 2020-02-14 13:19:16 +08:00
With pandas, expand the fields conditionally.
|