General, Gmail: BatchRequest for 25 gmail threads is very slow #1573

Closed
gshahbazian opened this issue Aug 31, 2020 · 2 comments · Fixed by #1749
Assignees
Neenu1995

Labels
priority: p2 (Moderately-important priority. Fix may not be included in next release.)
🚨 This issue needs some love.
type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)

Comments


gshahbazian commented Aug 31, 2020

Environment details

  1. OS type and version: Android 30
  2. Java version: 8, Kotlin 1.4.0
  3. google-api-client version(s): 1.30.10

Steps to reproduce

Use the BatchRequest class to fetch 25 threads from Gmail

Code example

import com.google.api.client.googleapis.batch.BatchCallback
import com.google.api.client.googleapis.json.GoogleJsonErrorContainer
import com.google.api.client.http.HttpHeaders
import com.google.api.services.gmail.Gmail
import com.google.api.services.gmail.model.Thread

val gmailService: Gmail = TODO("Configured Gmail service")
val batch = gmailService.batch()

// List of 25 thread ids
val threadIds = listOf("THREAD_ID" /* ... */)

for (threadId in threadIds) {
  batch.queue<Thread, GoogleJsonErrorContainer>(
    gmailService.users().threads().get("me", threadId).buildHttpRequest(),
    Thread::class.java,
    GoogleJsonErrorContainer::class.java,
    object : BatchCallback<Thread, GoogleJsonErrorContainer> {
        override fun onSuccess(t: Thread, responseHeaders: HttpHeaders?) {
            println("BATCH SUCCESS")
        }
        override fun onFailure(e: GoogleJsonErrorContainer, responseHeaders: HttpHeaders?) {
            println("BATCH ERROR: ${e.error?.message}")
        }
    }
  )
}

batch.execute()

Stack trace

Pausing the debugger while the batch response is being parsed shows it stuck here:

while ((line = readRawLine()) != null && !line.startsWith(boundary)) {

The entire batch takes ~2 minutes to return, while the google-api-objectivec-client-for-rest library returns the same set in under 2 seconds.

yoshi-automation added the triage me (I really want to be triaged.) label on Sep 1, 2020

gshahbazian commented Sep 2, 2020

I've tracked this down further. The degraded performance occurs when the response is gzipped, so getContent() on the response returns a GZIPInputStream. Reading that stream type byte by byte to separate each batched response is extremely slow.

Locally I've added a workaround which reads the entire stream upfront and passes a simple ByteArrayInputStream to BatchUnparsedResponse. Workaround code change: superhuman#1
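A minimal sketch of the workaround idea, assuming the bulk read happens before multipart parsing starts (the helper below is hypothetical, not the actual change in superhuman#1): drain the gzipped response with one bulk read, then hand the parser an in-memory stream so its per-byte boundary scan no longer goes through GZIP inflation.

import java.io.ByteArrayInputStream
import java.io.InputStream

// Hypothetical helper, not the actual patch: read the whole decompressed body up front.
// Every subsequent read() by the boundary scanner is then an array access instead of a
// trip through the GZIPInputStream's inflater.
fun bufferEntireResponse(content: InputStream): InputStream {
    val bytes = content.use { it.readBytes() } // single bulk read through the gzip stream
    return ByteArrayInputStream(bytes)         // cheap byte-by-byte reads for the parser
}

The trade-off is that the whole batch response is held in memory at once, which is what the maintainers push back on below.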

yoshi-automation added the 🚨 This issue needs some love. label on Sep 5, 2020
chingor13 added the priority: p2 and type: bug labels on Sep 8, 2020
yoshi-automation removed the 🚨 This issue needs some love. and triage me labels on Sep 8, 2020
yoshi-automation added and removed the 🚨 This issue needs some love. label on Jan 9, 2021
yoshi-automation added the 🚨 This issue needs some love. label on Feb 28, 2021
Neenu1995 self-assigned this on Mar 2, 2021
chingor13 (Collaborator) commented

It looks like we originally did not use a BufferedReader because it does not handle all of HTTP RFC 2616. We also don't want to read the entire response upfront as that could blow our memory usage given a huge response.

We will likely want to add buffering to the current implementation to reduce the number of direct reads from the GZipInputStream while limiting memory usage.
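A minimal sketch of that buffering idea, assuming the stream is wrapped at the point where the batch parser takes over the response body (the wrap point below is illustrative, not the library's actual code):

import java.io.BufferedInputStream
import java.io.InputStream

// Hypothetical wrap point: keep streaming semantics, but serve the parser's one-byte
// reads from a fixed-size buffer that is refilled in bulk from the GZIPInputStream.
fun withReadBuffer(content: InputStream, bufferSize: Int = 8 * 1024): InputStream =
    BufferedInputStream(content, bufferSize)

Unlike the read-everything workaround above, memory stays bounded by the buffer size no matter how large the batch response is.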

gcf-merge-on-green bot pushed a commit that referenced this issue Mar 18, 2021
If reading byte-by-byte from a GZipInputStream is slow, then wrapping it in a BufferedInputStream should minimize this effect by reading in larger, buffered chunks.

Added a test to ensure that parsing a large BatchResponse does not take an exorbitant amount of time.

Fixes #1573  ☕️