Handling a multithreaded race on a dictionary shared across a Python/C++ mixed-language interface

While working with a dictionary in Python 3, I ran into this error: dictionary changed size during iteration

The original code was:

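from copy import deepcopy

def update_data_dict(self) -> (dict, bool):
    # Return a deep copy of the cache plus a flag:
    # False when time_point has advanced, True otherwise.
    if self.time_point > self.pre_time_point:
        self.pre_time_point = self.time_point
        return deepcopy(self.access), False
    else:
        return deepcopy(self.access), True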

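Here access is a cache variable (a dictionary) whose data is updated by a callback from a C++ extension module. At low volume (100 to 200 updates per second) no problem ever appeared, but once the load was raised to 3,000 to 5,000 updates per second, this bug showed up: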

dictionary changed size during iteration

Dictionaries implement a tp_iter slot that returns an efficient iterator that iterates over the keys of the dictionary. During such an iteration, the dictionary should not be modified, except that setting the value for an existing key is allowed (deletions or additions are not, nor is the update() method).
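
The rule quoted above can be reproduced in a single thread, without any C++ involved. The snippet below is only a minimal illustration; try_iteration, overwrite_existing, and add_new_key are names made up for this demo and are not part of the original project.

```python
def try_iteration(mutate):
    """Iterate over a small dict, calling `mutate` at each step; report the outcome."""
    data = {"a": 1, "b": 2}
    try:
        for key in data:
            mutate(data, key)
    except RuntimeError as exc:
        return str(exc)
    return "ok"


def overwrite_existing(data, key):
    # Setting the value of an existing key keeps the size unchanged: allowed.
    data[key] = data[key] + 1


def add_new_key(data, key):
    # Adding a key changes the size, so the next iteration step fails.
    data[key + "_new"] = 0


result_overwrite = try_iteration(overwrite_existing)
result_add = try_iteration(add_new_key)
print(result_overwrite)
print(result_add)
```

In the multithreaded case the mutation simply comes from another thread (here, the C++ callback) instead of from the loop body.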

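So the problem comes from several threads sharing one variable without a thread lock. The proper fix is to change both the Python code and the C++ interface code in step and add a thread synchronization lock, but because that touches the mixed-language interface, the change is fairly involved and tightens the coupling between modules. By chance I found an unconventional workaround online: a Python dictionary cannot change size while it is being iterated over, but a list or set built from its keys can be iterated while the dictionary changes. For a lightly used function, iterating over such a list instead of the dict avoids the extra work. The modified code: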

def update_data_dict(self) -> (dict, bool):
    """
    Python 3 raises "dictionary changed size during iteration" when a dict is resized mid-iteration.
    https://blog.csdn.net/zhihaoma/article/details/51265168
    :return: a shallow copy of self.access and a bool flag
    """
    access = dict()
    # Snapshot the keys into a list so the loop does not iterate over the live dict.
    for key in list(self.access.keys()):
        access[key] = self.access[key]

    if self.time_point > self.pre_time_point:
        self.pre_time_point = self.time_point
        return access, False
    else:
        return access, True
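
One thing worth checking with this workaround is what kind of copy it produces. The small experiment below (the source dict and its contents are made up for illustration) shows that looping over a key snapshot tolerates insertions into the live dict, and that the result is a shallow copy, unlike the deepcopy used in the original version.

```python
from copy import deepcopy

source = {"a": [1], "b": [2]}  # hypothetical cache contents

snapshot = {}
for key in list(source.keys()):   # keys are frozen into a list up front
    snapshot[key] = source[key]
    source[key + "_late"] = []    # resizing the live dict no longer breaks the loop

# The loop copies references only, so values are shared with the live dict.
shares_values = snapshot["a"] is source["a"]

# deepcopy, as used in the original function, duplicates the values too.
deep_shares = deepcopy(source)["a"] is source["a"]

print(sorted(snapshot), sorted(source), shares_values, deep_shares)
```

So if callers mutate the returned values, or the callback mutates values in place, the two versions of update_data_dict do not behave identically. Also, if the callback could ever delete keys, a lookup after the snapshot might raise KeyError; the original post does not say whether that can happen.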

Alternatively:

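def update_data_dict(self) -> (dict, bool):
    """
    Python 3 raises "dictionary changed size during iteration" when a dict is resized mid-iteration.
    https://blog.csdn.net/zhihaoma/article/details/51265168
    :return: a shallow copy of self.access and a bool flag
    """
    access = dict()
    # Looping over self.access directly would still raise the same error if
    # another thread resizes it, so snapshot the keys first (a tuple here).
    for key in tuple(self.access):
        access[key] = self.access[key]

    if self.time_point > self.pre_time_point:
        self.pre_time_point = self.time_point
        return access, False
    else:
        return access, True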
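
For comparison, the lock-based fix mentioned earlier could look roughly like the sketch below on the Python side. This is not the code from my project: Cache, on_data, snapshot, and _lock are hypothetical names, and a plain Python thread stands in for the C++ callback, which in the real module would have to acquire the same lock before touching the dict.

```python
import threading
from copy import deepcopy


class Cache:
    """Hypothetical stand-in for the class that owns `access`."""

    def __init__(self):
        self.access = {}
        self._lock = threading.Lock()  # assumed name, not in the original code

    def on_data(self, key, value):
        # Writer side: in the real project this would be the C++ callback.
        with self._lock:
            self.access[key] = value

    def snapshot(self):
        # Reader side: the deep copy runs while writers are blocked.
        with self._lock:
            return deepcopy(self.access)


def writer(cache, count):
    for i in range(count):
        cache.on_data(i, {"seq": i})


cache = Cache()
thread = threading.Thread(target=writer, args=(cache, 2000))
thread.start()
copies = [cache.snapshot() for _ in range(20)]  # read while the writer runs
thread.join()
final = cache.snapshot()
print(len(final))
```

The lock keeps deepcopy usable, at the cost of every writer having to cooperate, which is exactly the coupling the key-snapshot workaround avoids.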