In this post, we investigate how to efficiently delete certain elements from a Python list. More specifically, you want to delete certain elements, whose indices are given.
def removeA(myList, removalIndices):
if not isinstance(removalIndices, set):
removalIndices = set(removalIndices)
return [v for i, v in enumerate(myList) if not (i in removalIndices)]
This solution moves elements that do not need remove to a new list. Clearly, it would not be quite efficient when the number of removals is much less than the length of the original list. Besides, if the indices of the removals are not stored in a set
but a list
, the conversion from list
to set
would further drop the efficiency.
def removeB(myList, removalIndices):
for removalIndex in removalIndices[::-1]:
del myList[removalIndex]
return myList
This solution removes the elements from the last (the one with the largest index) to the first (the one with the smallest index). This solution assumes that the indices of removals are sorted in an ascending order. Note that, removing from the first one is not a good idea, since each removal will change the indices of the remaining removals. In other words, for removals other than the first one, you have to recalculate the index before removing.
def removeC(myList, removalIndices):
nextRemovalIndex = removalIndices[0]
indexOfRemovalIndices = 1
numOfRemovals = len(removalIndices)
lenOfList = len(myList)
currentIndex = nextRemovalIndex + 1
while currentIndex < lenOfList:
nextIndex = removalIndices[indexOfRemovalIndices] if indexOfRemovalIndices < numOfRemovals else lenOfList
while currentIndex < nextIndex:
myList[nextRemovalIndex] = myList[currentIndex]
nextRemovalIndex += 1
currentIndex += 1
indexOfRemovalIndices += 1
currentIndex += 1
# pay attention
myList[-numOfRemovals:] = []
return myList
This solution first shifts all the removals to the end of the list, and then removes them. Note that, this solution also assumes the indices of removals are sorted in an ascending order. Compare to Solution B, it guarantees that every element that needs move just moves once.
myList
has $10^6$ elements, removing $80\%$)
Random locations
Solution A (with
list
representing removals, i.e.,removalIndices
is alist
): 159.12msSolution A (with
set
representing removals, i.e.,removalIndices
is aset
): 134.69msSolution B: 1.13ms
Solution C: 132.39ms
Remarks: during the measurements, I found that the set
create from different status (e.g., sorted or unsorted) of a list
will have different efficiency. I will investigate this in details in future.
From the beginning
Solution A (with
list
representing removals, i.e.,removalIndices
is alist
): 187.58msSolution A (with
set
representing removals, i.e.,removalIndices
is aset
): 133.30sSolution B: 7403.70ms
Solution C: 221.80ms
From the ending
Solution A (with
list
representing removals, i.e.,removalIndices
is alist
): 133.34msSolution A (with
set
representing removals, i.e.,removalIndices
is aset
): 142.03sSolution B: 0.74ms
Solution C: 121.12ms
myList
has $10^6$ elements, removing $0.2\%$)
Random locations
Solution A (with
list
representing removals, i.e.,removalIndices
is alist
): 134.81msSolution A (with
set
representing removals, i.e.,removalIndices
is aset
): 149.77msSolution B: 216.96ms
Solution C: 159.68ms
From the beginning
Solution A (with
list
representing removals, i.e.,removalIndices
is alist
): 150.39msSolution A (with
set
representing removals, i.e.,removalIndices
is aset
): 153.45msSolution B: 2098.12ms
Solution C: 182.74ms
From the ending
Solution A (with
list
representing removals, i.e.,removalIndices
is alist
): 131.15msSolution A (with
set
representing removals, i.e.,removalIndices
is aset
): 134.47sSolution B: 204.69ms
Solution C: 163.19ms