Iterate an iterator by chunks (of n) in Python?

lastmoon 2023. 7. 16. 17:45

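Can you think of a nice way to split an iterator into chunks of a given size?

For example, with `l=[1,2,3,4,5,6,7]`, `chunks(l,3)` would become an iterator over `[1,2,3], [4,5,6], [7]`.

I can think of a small program to do that, but not a nice way using, say, itertools.

The `grouper()` recipe from the recipes section of the itertools documentation comes close to what you want: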

from itertools import zip_longest

def grouper(iterable, n, *, incomplete='fill', fillvalue=None):
    "Collect data into non-overlapping fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, fillvalue='x') --> ABC DEF Gxx
    # grouper('ABCDEFG', 3, incomplete='strict') --> ABC DEF ValueError
    # grouper('ABCDEFG', 3, incomplete='ignore') --> ABC DEF
    args = [iter(iterable)] * n
    if incomplete == 'fill':
        return zip_longest(*args, fillvalue=fillvalue)
    if incomplete == 'strict':
        return zip(*args, strict=True)
    if incomplete == 'ignore':
        return zip(*args)
    else:
        raise ValueError('Expected fill, strict, or ignore')

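This won't work well when the last chunk is incomplete, though: depending on the `incomplete` mode, it will either fill up the last chunk with a fill value, raise an exception, or silently drop the incomplete chunk.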
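
To make that concrete, here is a small sketch of what the `fill` and `ignore` behaviours produce for a seven-element input. It calls `zip_longest` and `zip` directly instead of going through the recipe, and the helper names `fill_chunks` and `drop_chunks` are made up for this illustration.

```python
from itertools import zip_longest


def fill_chunks(iterable, n, fillvalue=None):
    # Same trick as the recipe: n references to one shared iterator.
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


def drop_chunks(iterable, n):
    # zip stops as soon as the shared iterator runs dry mid-tuple.
    args = [iter(iterable)] * n
    return zip(*args)


print(list(fill_chunks('ABCDEFG', 3, fillvalue='x')))
# [('A', 'B', 'C'), ('D', 'E', 'F'), ('G', 'x', 'x')]
print(list(drop_chunks('ABCDEFG', 3)))
# [('A', 'B', 'C'), ('D', 'E', 'F')]
```

Neither result ends with the short `('G',)` chunk the question asks for.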

In more recent versions of the recipes they added the `batched` recipe, which does exactly what you want:

from itertools import islice

def batched(iterable, n):
    "Batch data into tuples of length n. The last batch may be shorter."
    # batched('ABCDEFG', 3) --> ABC DEF G
    if n < 1:
        raise ValueError('n must be at least one')
    it = iter(iterable)
    while (batch := tuple(islice(it, n))):
        yield batch

Finally, a less general solution that only works on sequences, but does handle the last chunk as desired and preserves the type of the original sequence, is:

(my_list[i:i + chunk_size] for i in range(0, len(my_list), chunk_size))
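
As a quick illustration of the "preserves the type" point, the generator expression can be wrapped in a function and tried on a list, a string and a tuple. The wrapper name `chunked_slices` is an assumption for this sketch, not part of the original answer.

```python
def chunked_slices(seq, chunk_size):
    """Lazily yield slices of seq; each slice has the same type as seq."""
    return (seq[i:i + chunk_size] for i in range(0, len(seq), chunk_size))


print(list(chunked_slices([1, 2, 3, 4, 5, 6, 7], 3)))  # [[1, 2, 3], [4, 5, 6], [7]]
print(list(chunked_slices('ABCDEFG', 3)))              # ['ABC', 'DEF', 'G']
print(list(chunked_slices((1, 2, 3, 4), 3)))           # [(1, 2, 3), (4,)]
```

Because it relies on `len()` and slicing, it will not accept a bare iterator or generator.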

Although the OP asks for a function that returns chunks as lists or tuples, in case you need to return iterators, Sven Marnach's solution can be modified as follows:

import itertools

def batched_it(iterable, n):
    "Batch data into iterators of length n. The last batch may be shorter."
    # batched_it('ABCDEFG', 3) --> ABC DEF G
    if n < 1:
        raise ValueError('n must be at least one')
    it = iter(iterable)
    while True:
        chunk_it = itertools.islice(it, n)
        try:
            first_el = next(chunk_it)
        except StopIteration:
            return
        yield itertools.chain((first_el,), chunk_it)

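Some benchmarks: http://pastebin.com/YkKFvm8b

It will be slightly more efficient only if your function iterates through the elements of every chunk.

Since Python 3.8, there is a simpler solution using the `:=` operator: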

import itertools
from typing import Iterator

def grouper(iterator: Iterator, n: int) -> Iterator[list]:
    while chunk := list(itertools.islice(iterator, n)):
        yield chunk

Then call it like this:

>>> list(grouper(iter('ABCDEFG'), 3))
[['A', 'B', 'C'], ['D', 'E', 'F'], ['G']]

Note: you can put the `iter` call inside the `grouper` function so that it takes an `Iterable` instead of an `Iterator`.
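
A minimal sketch of that variation, assuming the only change is the `iter()` call at the top; the name `grouper_any` and the type variable are illustrative additions.

```python
import itertools
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def grouper_any(iterable: Iterable[T], n: int) -> Iterator[List[T]]:
    # Obtain one iterator up front so every islice call continues where
    # the previous one stopped.
    iterator = iter(iterable)
    while chunk := list(itertools.islice(iterator, n)):
        yield chunk


print(list(grouper_any('ABCDEFG', 3)))  # [['A', 'B', 'C'], ['D', 'E', 'F'], ['G']]
print(list(grouper_any([], 3)))         # []
```

Without the `iter()` call, passing a list would make each `islice` start again from the first element, so the loop would never finish.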

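This works on any iterable. It returns a generator of generators (for full flexibility). I now realize that it's basically the same as @reclosedev's solution, but without the fluff. There is no need for `try...except`, since the `StopIteration` propagates up, which is what we want.

The `next(iterable)` call is needed to raise `StopIteration` when the iterable is empty, since `islice` will keep spawning empty generators forever if you let it.

It's better because it's only two lines long, yet easy to understand.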

def grouper(iterable, n):
    while True:
        yield itertools.chain((next(iterable),), itertools.islice(iterable, n-1))

Note: `next(iterable)` is put into a tuple. Otherwise, if `next(iterable)` were itself iterable, `itertools.chain` would flatten it out. Thanks to Jeremy Brown for pointing out this issue.
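
One caveat: since Python 3.7 (PEP 479), a `StopIteration` that escapes from inside a generator is converted into a `RuntimeError`, so on current interpreters the two-liner above ends with an error instead of stopping quietly. A sketch of an equivalent that lets a `for` loop absorb the end of the input follows; the name `grouper_safe` is only illustrative.

```python
import itertools


def grouper_safe(iterable, n):
    it = iter(iterable)
    # The for loop stops cleanly when `it` is exhausted, so no
    # StopIteration ever leaks out of the generator body.
    for first in it:
        # Wrap `first` in a tuple so chain() does not flatten it.
        yield itertools.chain((first,), itertools.islice(it, n - 1))


print([list(c) for c in grouper_safe('ABCDEFG', 3)])
# [['A', 'B', 'C'], ['D', 'E', 'F'], ['G']]
print([list(c) for c in grouper_safe([[1, 2], [3, 4], [5]], 2)])
# [[[1, 2], [3, 4]], [[5]]]
```

As with the original, each chunk has to be consumed before the next one is requested, because all chunks read from the same underlying iterator.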

Python 3.12 adds `itertools.batched`, which works on all iterables (including lists):

>>> from itertools import batched
>>> for batch in batched('ABCDEFG', 3):
...     print(batch)
('A', 'B', 'C')
('D', 'E', 'F')
('G',)
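
For code that also has to run on interpreters older than 3.12, one possible arrangement is to fall back to the recipe shown earlier when the import fails. This is a sketch under that assumption, not something the standard library provides.

```python
try:
    from itertools import batched  # available from Python 3.12
except ImportError:
    from itertools import islice

    def batched(iterable, n):
        "Fallback based on the itertools recipe; the last batch may be shorter."
        if n < 1:
            raise ValueError('n must be at least one')
        it = iter(iterable)
        while batch := tuple(islice(it, n)):
            yield batch


print(list(batched('ABCDEFG', 3)))
# [('A', 'B', 'C'), ('D', 'E', 'F'), ('G',)]
```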

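I was working on something today and came up with what I think is a simple solution. It is similar to jsbueno's answer, but I believe his would yield an empty `group` when the length of `iterable` is divisible by `n`. My answer does a simple check when the `iterable` is exhausted.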

def chunk(iterable, chunk_size):
    """Generates lists of `chunk_size` elements from `iterable`.
    
    
    >>> list(chunk((2, 3, 5, 7), 3))
    [[2, 3, 5], [7]]
    >>> list(chunk((2, 3, 5, 7), 2))
    [[2, 3], [5, 7]]
    """
    iterable = iter(iterable)
    while True:
        chunk = []
        try:
            for _ in range(chunk_size):
                chunk.append(next(iterable))
            yield chunk
        except StopIteration:
            if chunk:
                yield chunk
            break

Here is one that returns lazy chunks; use `map(list, chunks(...))` if you want lists.

from itertools import islice, chain
from collections import deque

def chunks(items, n):
    items = iter(items)
    for first in items:
        chunk = chain((first,), islice(items, n-1))
        yield chunk
        deque(chunk, 0)

if __name__ == "__main__":
    for chunk in map(list, chunks(range(10), 3)):
        print(chunk)

    for i, chunk in enumerate(chunks(range(10), 3)):
        if i % 2 == 1:
            print("chunk #%d: %s" % (i, list(chunk)))
        else:
            print("skipping #%d" % i)

A succinct implementation is:

chunker = lambda iterable, n: (ifilterfalse(lambda x: x == (), chunk) for chunk in (izip_longest(*[iter(iterable)]*n, fillvalue=())))

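This works because `[iter(iterable)]*n` is a list containing the same iterator `n` times; zipping over it takes one item from each iterator in the list, which is the same iterator every time, so each zip element contains a group of `n` consecutive items.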
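
The shared-iterator behaviour can be checked directly. The sketch below uses the built-in `zip` so that it runs on Python 3, and contrasts the shared case with three independent iterators; the variable names are illustrative.

```python
it = iter(range(6))
refs = [it] * 3                        # three references to ONE iterator object
same_object = all(r is it for r in refs)
groups = list(zip(*refs))              # each tuple pulls three consecutive items

# For contrast: three separate iterators advance in lockstep instead.
independent = list(zip(*[iter(range(3)) for _ in range(3)]))

print(same_object)   # True
print(groups)        # [(0, 1, 2), (3, 4, 5)]
print(independent)   # [(0, 0, 0), (1, 1, 1), (2, 2, 2)]
```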

`izip_longest` is needed to fully consume the underlying iterable, rather than having iteration stop when the first exhausted iterator is reached, which would chop off any remainder of `iterable`. This means the fill value has to be filtered out. A slightly more robust implementation would therefore be:

def chunker(iterable, n):
    class Filler(object): pass
    return (ifilterfalse(lambda x: x is Filler, chunk) for chunk in (izip_longest(*[iter(iterable)]*n, fillvalue=Filler)))

This guarantees that the fill value is never an item in the underlying iterable. Using the definition above:

iterable = range(1,11)

map(tuple,chunker(iterable, 3))
[(1, 2, 3), (4, 5, 6), (7, 8, 9), (10,)]

map(tuple,chunker(iterable, 2))
[(1, 2), (3, 4), (5, 6), (7, 8), (9, 10)]

map(tuple,chunker(iterable, 4))
[(1, 2, 3, 4), (5, 6, 7, 8), (9, 10)]
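
The snippets in this answer use the Python 2 names `izip_longest` and `ifilterfalse`, and rely on Python 2's `map` returning a list. In Python 3 the corresponding itertools functions are `zip_longest` and `filterfalse`. A sketch of the same idea for Python 3, with `_FILL` as a hypothetical sentinel name:

```python
from itertools import filterfalse, zip_longest

_FILL = object()  # unique sentinel that cannot appear in user data


def chunker(iterable, n):
    groups = zip_longest(*[iter(iterable)] * n, fillvalue=_FILL)
    # Strip the sentinel padding from each group lazily.
    return (filterfalse(lambda x: x is _FILL, group) for group in groups)


print(list(map(tuple, chunker(range(1, 11), 3))))
# [(1, 2, 3), (4, 5, 6), (7, 8, 9), (10,)]
```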

This implementation almost does what you want, but it has issues:

def chunks(it, step):
  start = 0
  while True:
    end = start+step
    yield islice(it, start, end)
    start = end

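(The difference is that because `islice` does not raise `StopIteration` or anything else on calls that go beyond the end of `it`, this will yield forever; there is also the slightly tricky issue that the `islice` results must be consumed before this generator is advanced.)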

To generate the moving window functionally:

izip(count(0, step), count(step, step))

So this becomes:

(it[start:end] for (start,end) in izip(count(0, step), count(step, step)))

But that still creates an infinite iterator. So you need `takewhile` (or perhaps something else might be better) to limit it:

chunk = lambda it, step: takewhile((lambda x: len(x) > 0), (it[start:end] for (start,end) in izip(count(0, step), count(step, step))))

g = chunk(range(1,11), 3)

tuple(g)
([1, 2, 3], [4, 5, 6], [7, 8, 9], [10])

"Simpler is better than complex" - a straightforward generator a few lines long can do the job. Just put it in some utilities module or similar:

def grouper (iterable, n):
    iterable = iter(iterable)
    count = 0
    group = []
    while True:
        try:
            group.append(next(iterable))
            count += 1
            if count % n == 0:
                yield group
                group = []
        except StopIteration:
            yield group
            break
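
As written, the final `yield group` runs even when `group` is empty, which happens whenever the input length is a multiple of `n`. A sketch of the same generator with a guard on that last yield; the name `grouper_guarded` is an assumption used to keep it apart from the original.

```python
def grouper_guarded(iterable, n):
    iterable = iter(iterable)
    group = []
    while True:
        try:
            group.append(next(iterable))
        except StopIteration:
            if group:          # only emit a trailing group that has items
                yield group
            break
        if len(group) == n:
            yield group
            group = []


print(list(grouper_guarded(range(6), 3)))  # [[0, 1, 2], [3, 4, 5]]
print(list(grouper_guarded(range(7), 3)))  # [[0, 1, 2], [3, 4, 5], [6]]
```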

I forget where I found the inspiration for this. I've modified it a little to work with MSI GUIDs in the Windows Registry:

def nslice(s, n, truncate=False, reverse=False):
    """Splits s into n-sized chunks, optionally reversing the chunks."""
    assert n > 0
    while len(s) >= n:
        if reverse: yield s[:n][::-1]
        else: yield s[:n]
        s = s[n:]
    if len(s) and not truncate:
        yield s

`reverse` doesn't apply to your question, but it's something I use extensively with this function.

>>> [i for i in nslice([1,2,3,4,5,6,7], 3)]
[[1, 2, 3], [4, 5, 6], [7]]
>>> [i for i in nslice([1,2,3,4,5,6,7], 3, truncate=True)]
[[1, 2, 3], [4, 5, 6]]
>>> [i for i in nslice([1,2,3,4,5,6,7], 3, truncate=True, reverse=True)]
[[3, 2, 1], [6, 5, 4]]

Here you go.

def chunksiter(l, chunks):
    i,j,n = 0,0,0
    rl = []
    while n < len(l)/chunks:        
        rl.append(l[i:j+chunks])        
        i+=chunks
        j+=chunks  # advance by one chunk (j+=j+chunks overshoots on longer lists)
        n+=1
    return iter(rl)


def chunksiter2(l, chunks):
    i,j,n = 0,0,0
    while n < len(l)/chunks:        
        yield l[i:j+chunks]
        i+=chunks
        j+=chunks  # advance by one chunk (j+=j+chunks overshoots on longer lists)
        n+=1

Example:

for l in chunksiter([1,2,3,4,5,6,7,8],3):
    print(l)

[1, 2, 3]
[4, 5, 6]
[7, 8]

for l in chunksiter2([1,2,3,4,5,6,7,8],3):
    print(l)

[1, 2, 3]
[4, 5, 6]
[7, 8]


for l in chunksiter2([1,2,3,4,5,6,7,8],5):
    print(l)

[1, 2, 3, 4, 5]
[6, 7, 8]

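A couple of improvements on reclosedev's answer that make it:

1. Operate more efficiently, and with less boilerplate code in the loop, by delegating the pulling of the first element to Python itself rather than doing so manually with a `next` call in a `try`/`except StopIteration:` block.
2. Handle the case where the user discards the rest of the elements in any given chunk (e.g. an inner loop over the chunk `break`s under certain conditions). In reclosedev's solution, aside from the very first element (which is definitely consumed), any other "skipped" elements aren't actually skipped: they just become the initial elements of the next chunk, which means you're no longer pulling data from `n`-aligned offsets, and if the caller `break`s a loop over a chunk, they must manually consume the remaining elements even if they don't need them.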

Combining those two fixes gets:

import collections  # At top of file
from itertools import chain, islice  # At top of file, denamespaced for slight speed boost

# Pre-create a utility "function" that silently consumes and discards all remaining elements in
# an iterator. This is the fastest way to do so on CPython (deque has a specialized mode
# for maxlen=0 that pulls and discards faster than Python level code can, and by precreating
# the deque and prebinding the extend method, you don't even need to create new deques each time)
_consume = collections.deque(maxlen=0).extend

def batched_it(iterable, n):
    "Batch data into sub-iterators of length n. The last batch may be shorter."
    # batched_it('ABCDEFG', 3) --> ABC DEF G
    if n < 1:
        raise ValueError('n must be at least one')
    n -= 1  # First element pulled for us, pre-decrement n so we don't redo it every loop
    it = iter(iterable)
    for first_el in it:
        chunk_it = islice(it, n)
        try:
            yield chain((first_el,), chunk_it)
        finally:
            _consume(chunk_it)  # Efficiently consume any elements caller didn't consume
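
To see the second fix in action, the sketch below repeats the function (without its comments) so that it runs on its own, then reads only the first element of each chunk. The `heads` list is an illustrative name.

```python
import collections
from itertools import chain, islice

_consume = collections.deque(maxlen=0).extend


def batched_it(iterable, n):
    if n < 1:
        raise ValueError('n must be at least one')
    n -= 1
    it = iter(iterable)
    for first_el in it:
        chunk_it = islice(it, n)
        try:
            yield chain((first_el,), chunk_it)
        finally:
            # Runs when the caller asks for the next chunk: drop leftovers.
            _consume(chunk_it)


heads = []
for chunk in batched_it('ABCDEFG', 3):
    heads.append(next(chunk))  # look at the first element only

print(heads)  # ['A', 'D', 'G']
```

Every chunk still starts at an `n`-aligned offset even though the caller skipped the rest of each chunk.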

Code golf edition:

def grouper(iterable, n):
    for i in range(0, len(iterable), n):
        yield iterable[i:i+n]

Usage:

>>> list(grouper('ABCDEFG', 3))
['ABC', 'DEF', 'G']

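This function takes iterables that do not need to be `Sized`, so it accepts iterators too. It supports infinite iterables and raises an error if a chunk size smaller than 1 is selected (even though giving size == 1 is effectively useless).

The type annotations are of course optional, and the `/` in the parameters (which makes `iterable` positional-only) can be removed if you wish.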

from typing import Generator, Iterable, TypeVar

T = TypeVar("T")


def chunk(iterable: Iterable[T], /, size: int) -> Generator[list[T], None, None]:
    """Yield chunks of a given size from an iterable."""
    if size < 1:
        raise ValueError("Cannot make chunks smaller than 1 item.")

    def chunker():
        current_chunk = []
        for item in iterable:
            current_chunk.append(item)

            if len(current_chunk) == size:
                yield current_chunk

                current_chunk = []

        if current_chunk:
            yield current_chunk

    # Chunker generator is returned instead of yielding directly so that the size check
    #  can raise immediately instead of waiting for the first next() call.
    return chunker()
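
The comment about returning the inner generator can be illustrated by comparing a plain generator with a wrapper that validates first. This is a simplified sketch; `lazy_chunks` and `eager_chunks` are hypothetical names, not part of the answer.

```python
def lazy_chunks(iterable, size):
    # Plain generator: nothing in this body runs until the first next() call.
    if size < 1:
        raise ValueError("size must be at least 1")
    chunk = []
    for item in iterable:
        chunk.append(item)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


def eager_chunks(iterable, size):
    # Ordinary function: the check runs at call time, then a generator is returned.
    if size < 1:
        raise ValueError("size must be at least 1")
    return lazy_chunks(iterable, size)


gen = lazy_chunks([1, 2, 3], 0)        # no error yet
try:
    next(gen)
except ValueError as exc:
    print("lazy: raised on first next():", exc)

try:
    eager_chunks([1, 2, 3], 0)
except ValueError as exc:
    print("eager: raised at call time:", exc)
```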

Recursive solution:

from typing import Iterator, Sequence

def batched(i: Sequence, split: int) -> Iterator[Sequence]:
    if chunk := i[:split]:
        yield chunk
        yield from batched(i[split:], split)

Here is a simple one:

n=2
l = list(range(15))
[l[i:i+n] for i in range(len(l)) if i%n==0]
Out[10]: [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9], [10, 11], [12, 13], [14]]

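for i in range(len(l)): This part iterates over the indices of l, using the range() function with len(l) as the upper limit.

if i % n == 0: This condition filters which indices contribute to the new list. i % n checks whether the current index i is divisible by n with no remainder. If it is, a slice starting at that index is included in the new list; otherwise the index is skipped.

l[i:i+n]: This part extracts a sublist from l. The slice notation covers the indices from i to i+n-1. So for each index i that satisfies i % n == 0, a sublist of up to n elements is created starting at that index (the last one may be shorter).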
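
Since only every n-th index passes the filter, the same list can be built by letting `range` step by `n`, which avoids testing every index. A short sketch comparing the two forms; the names `filtered` and `stepped` are illustrative.

```python
n = 2
l = list(range(15))

filtered = [l[i:i + n] for i in range(len(l)) if i % n == 0]
stepped = [l[i:i + n] for i in range(0, len(l), n)]

print(filtered == stepped)  # True
print(stepped[-1])          # [14]
```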

Source URL: https://stackoverflow.com/questions/8991506/iterate-an-iterator-by-chunks-of-n-in-python
