HDDS-15600. Fix ListObjects response for encoding-type, empty delimiter, and control-character prefix #10586

Open
Gargi-jais11 wants to merge 1 commit into apache:master from Gargi-jais11:HDDS-15600

Conversation

Gargi-jais11 commented Jun 23, 2026

What changes were proposed in this pull request?

Ozone S3 Gateway fails three s3-tests ListObjects cases that AWS S3 passes:

  • test_bucket_list_encoding_basic — With EncodingType=url, keys and CommonPrefixes containing spaces are encoded with + (Java URLEncoder form encoding) instead of AWS-style %20.
    Example: prefix quux ab/ is returned as quux+ab/ but should be quux%20ab/.
  • test_bucket_list_delimiter_empty — When Delimiter='' is sent, listing behavior is correct (all keys returned, no CommonPrefixes), but the response incorrectly includes a Delimiter field. AWS omits Delimiter from the XML when the client passes an empty delimiter.
  • test_bucket_list_prefix_unreadable — ListObjects with Prefix='\x0a' (newline) should echo the prefix unchanged in the response and return empty Contents/CommonPrefixes. Ozone instead always URL-encodes the echoed prefix (returning %0A), even when no encoding-type is requested.
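
The three expectations above can be sketched as a small field-selection routine. This is a minimal sketch under stated assumptions, not the gateway's code: `ListObjectsFieldsSketch`, `echoedFields`, and `encodeKey` are hypothetical names, keeping `/` unencoded is inferred from the `foo%2B1/` output in this PR's test runs, and encoding the echoed Delimiter when encoding-type=url is also an assumption.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class ListObjectsFieldsSketch {

  /** Hypothetical helper: percent-encode, with %20 for spaces and "/" kept as-is (assumption). */
  static String encodeKey(String value) {
    try {
      return URLEncoder.encode(value, "UTF-8")
          .replace("+", "%20")   // form encoding uses "+", S3 expects "%20"
          .replace("%2F", "/");  // assumption: path separators stay readable
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException("UTF-8 is always available", e);
    }
  }

  /** Returns the request-echo fields that would appear in the listing response. */
  static Map<String, String> echoedFields(String prefix, String delimiter, String encodingType) {
    boolean urlEncoding = "url".equals(encodingType);
    String safePrefix = prefix == null ? "" : prefix;
    Map<String, String> fields = new LinkedHashMap<>();

    // Echo the prefix raw unless the client asked for encoding-type=url.
    fields.put("Prefix", urlEncoding ? encodeKey(safePrefix) : safePrefix);

    // Omit Delimiter entirely when the client sent none or an empty one.
    if (delimiter != null && !delimiter.isEmpty()) {
      fields.put("Delimiter", urlEncoding ? encodeKey(delimiter) : delimiter);
    }
    return fields;
  }
}
```

With these rules, `echoedFields("\n", "", null)` yields only a `Prefix` entry holding the raw newline, while `echoedFields("quux ab/", "/", "url")` yields `quux%20ab/` and `/`.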

https://ozone.s3.peterxcli.dev/#latest-run-section

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-15600

How was this patch tested?

Before fix:

  1. encoding type:
bash-5.1$ aws s3api list-objects   --bucket buck1   --encoding-type url   --delimiter /   --endpoint-url http://s3g:9878/
{
    "Contents": [
        {
            "Key": "asdf%2Bb",
            "LastModified": "2026-06-19T09:51:39.569Z",
            "ETag": "\"f75b8179e4bbe7e2b4a074dcef62de95\"",
            "Size": 8,
            "StorageClass": "STANDARD",
            "Owner": {
                "DisplayName": "testuser",
                "ID": "bb2bd7ca4a327f84e6cd3979f8fa3828a50a08893c1b68f9d6715352c8d07b93"
            }
        }
    ],
    "CommonPrefixes": [
        {
            "Prefix": "foo/"
        },
        {
            "Prefix": "foo%2B1/"
        },
        {
            "Prefix": "quux+ab/"                       <----------------- wrong output
        }
    ],
    "RequestCharged": null,
    "Prefix": ""
}
  2. empty delimiter:
bash-5.1$ aws --debug s3api list-objects \
  --bucket buck1 \
  --delimiter "" \
  --endpoint-url http://s3g:9878 2>&1 \
  | grep -oE '<Prefix>[^<]*</Prefix>|<Delimiter>[^<]*</Delimiter>|<KeyCount>[^<]*</KeyCount>'
<Prefix></Prefix>
<KeyCount>0</KeyCount>
<Delimiter></Delimiter>           <---------- should not be present

  3. echoed prefix always URL-encoded:

bash-5.1$ aws s3api list-objects \
  --bucket buck1 \
  --prefix $'\n' \
  --endpoint-url http://s3g:9878/
{
    "RequestCharged": null,
    "Prefix": "%0A"               <------------------ wrong behaviour
}

After fix:

  1. encoding-type:
bash-5.1$ aws s3api list-objects   --bucket buck1   --encoding-type url   --delimiter /   --endpoint-url http://s3g:9878/
{
    "Contents": [
        {
            "Key": "asdf%2Bb",
            "LastModified": "2026-06-19T09:08:58.933Z",
            "ETag": "\"f75b8179e4bbe7e2b4a074dcef62de95\"",
            "Size": 8,
            "StorageClass": "STANDARD",
            "Owner": {
                "DisplayName": "testuser",
                "ID": "bb2bd7ca4a327f84e6cd3979f8fa3828a50a08893c1b68f9d6715352c8d07b93"
            }
        }
    ],
    "CommonPrefixes": [
        {
            "Prefix": "foo/"
        },
        {
            "Prefix": "foo%2B1/"
        },
        {
            "Prefix": "quux%20ab/"                <------------------------correct output
        }
    ],
    "RequestCharged": null,
    "Prefix": null
}
  2. empty delimiter:
bash-5.1$ aws --debug s3api list-objects \
  --bucket buck1 \
  --delimiter "" \
  --endpoint-url http://s3g:9878 2>&1 \
  | grep -oE '<Prefix>[^<]*</Prefix>|<Delimiter>[^<]*</Delimiter>|<KeyCount>[^<]*</KeyCount>'
  
<KeyCount>0</KeyCount>                 <-------------------- no empty delimiter present
  3. echoed prefix should not be URL-encoded:
bash-5.1$ aws s3api list-objects \
  --bucket buck1 \
  --prefix $'\n' \
  --endpoint-url http://s3g:9878/
{
    "RequestCharged": null,
    "Prefix": "\n"
}
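
The empty-delimiter check above relies on `grep` over debug output. As an alternative, a minimal sketch using only the JDK's XML parser could assert the absence of the element directly. `ListResponseCheckSketch` and `hasElement` are hypothetical names, and the XML strings in the usage note are hand-written, abridged stand-ins for the gateway's response rather than captured output.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

final class ListResponseCheckSketch {

  /** Returns true if the XML document contains at least one element with the given tag name. */
  static boolean hasElement(String xml, String tag) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      Document doc = factory.newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
      return doc.getElementsByTagName(tag).getLength() > 0;
    } catch (Exception e) {
      throw new IllegalArgumentException("Could not parse response XML", e);
    }
  }
}
```

For example, `hasElement("<ListBucketResult><KeyCount>0</KeyCount><Delimiter></Delimiter></ListBucketResult>", "Delimiter")` reports the element as present, matching the before-fix behaviour, whereas the same call on a document without a `Delimiter` element returns false.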

adoroszlai added the s3 (S3 Gateway) label Jun 23, 2026

adoroszlai left a comment

Thanks @Gargi-jais11 for the patch.

Comment on lines +66 to +74
/**
 * Percent-encode a string for S3 {@code encoding-type=url} responses.
 *
 * <p>Unlike {@link URLEncoder} (application/x-www-form-urlencoded), AWS S3
 * uses percent-encoding where spaces are {@code %20}, not {@code +}.
 */
public static String s3urlEncode(String str)
    throws UnsupportedEncodingException {
  return urlEncode(str).replace("+", "%20");
}
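
To see why the `replace("+", "%20")` step only touches encoded spaces, here is a self-contained sketch. It assumes `urlEncode` is a thin wrapper around `URLEncoder.encode` with UTF-8; the real Ozone helper is not shown in this hunk and evidently differs at least in keeping `/` unencoded, so the stand-in below is hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class S3UrlEncodeSketch {

  /** Hypothetical stand-in for the existing urlEncode helper (assumed: plain form encoding). */
  static String urlEncode(String str) {
    try {
      return URLEncoder.encode(str, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException("UTF-8 is always available", e);
    }
  }

  /** Same transformation as the method under review, applied to the stand-in. */
  static String s3urlEncode(String str) {
    // A literal "+" in the input is already "%2B" at this point,
    // so any "+" left in the encoded string can only be an encoded space.
    return urlEncode(str).replace("+", "%20");
  }

  public static void main(String[] args) {
    System.out.println(s3urlEncode("quux ab")); // quux%20ab
    System.out.println(s3urlEncode("foo+1"));   // foo%2B1
    // URLEncoder leaves "*" alone and encodes "~"; RFC 3986 treats them the other way round.
    System.out.println(s3urlEncode("a~b*c"));   // a%7Eb*c
  }
}
```

The last case is only an observation about `URLEncoder`; whether S3 clients depend on `~` and `*` being handled differently is not settled in this thread.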

Reply from the PR author (Gargi-jais11):

Okay, let me check and get back to you.

Gargi-jais11 marked this pull request as ready for review June 23, 2026 08:29