Skip to content

Use strlen(str) instead of sizeof(str)-1 for directly declared strings #8336

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

mvorisek
Copy link
Contributor

@mvorisek mvorisek commented Apr 10, 2022

Modernize & unify code to the prefered variant [1]

regex replace used:

  • from sizeof(\s*\(\s*"[^"\\]*"\s*\))\s*-\s*1(?!\w)
  • to strlen$1

I firstly replaced all *.h and *.c files and then carefully reviewed the rest files.

[1] #8241 (comment)

@mvorisek mvorisek force-pushed the const_strlen branch 2 times, most recently from 3fcb9a2 to 9f8ccaf Compare April 10, 2022 21:41
@mvorisek mvorisek marked this pull request as ready for review April 10, 2022 21:49
@mvorisek mvorisek changed the title Use strlen(str) instead of sizeof(str)-1 Use strlen(str) instead of sizeof(str)-1 for directly declared strings Apr 11, 2022
@TimWolla
Copy link
Member

TimWolla commented Apr 11, 2022

regex replace used:

Tip for this type of replacement: Use Coccinelle.

$ cat test.c
#include <string.h>

struct test {
	int a;
	int b;
}

void foo(char *buf, size_t len);
void bar(size_t bar);

int
main(void) {
	const char *foo = "foo";

	foo("foo", sizeof("foo") - 1);
	foo("foo", sizeof("foo"));
	foo(foo, sizeof(foo));

	bar(sizeof(struct test) - 1);
}
$ cat test.cocci 
@@
constant char[] s;
@@

- (sizeof(s) - 1)
+ strlen(s)
$ spatch -sp_file test.cocci test.c
init_defs_builtins: /usr/bin/../lib/coccinelle/standard.h
HANDLING: test.c
diff = 
--- test.c
+++ /tmp/cocci-output-3782-b347f1-test.c
@@ -12,7 +12,7 @@ int
 main(void) {
 	const char *foo = "foo";
 
-	foo("foo", sizeof("foo") - 1);
+	foo("foo", strlen("foo"));
 	foo("foo", sizeof("foo"));
 	foo(foo, sizeof(foo));

@mvorisek
Copy link
Contributor Author

- (sizeof(s) - 1)
+ strlen(s)

@TimWolla this PR aims to replace only the "directly declared strings" (like "string") and only if not used within case and {} array initialization (old compilers does not support strlen there), if you know a good tool to define such replaces based on C AST, please let me know and we can add it to #8295 to assert the code quality

@iluuu1994
Copy link
Member

I'm a bit worried about merge conflicts when merging upstream. We never removed the {{{ comments for the same reason. Although the impact there was surely bigger.

@TimWolla
Copy link
Member

@mvorisek Coccinelle 😃. The patch gets a little more complicated, though.

$ cat test.c; echo; echo
#include <stdio.h>
#include <string.h>

struct test {
	int a;
	int b;
};

void foo(const char *buf, size_t len) { }
void bar(size_t bar) { }

int
main(void) {
	const char *f = "foo";

	foo("foo", sizeof("foo") - 1);
	foo("foo", sizeof("foo"));
	foo(f, sizeof(f));

	bar(sizeof(struct test) - 1);
}

void
array_init() {
	size_t arr[] = { sizeof("foo") - 1, sizeof("bar") - 1 };
}

void
switch_case(size_t i) {
	switch (i) {
		case 1:
			break;
		case 2:
		case sizeof("foo") - 1:
			printf("%lu\n", sizeof("foo") - 1);
			break;
		case 5:
			break;
	}
}


$ cat test.cocci; echo; echo
@r_case@
position p;
expression e;
@@

switch (...)
{
case sizeof(e)@p - 1: ...
}

@r_array_init@
type T;
identifier arr;
expression e;
position p;
@@

T arr[] = {
...,
sizeof(e)@p - 1,
...
};

@@
constant char[] s;
position p != { r_case.p, r_array_init.p };
@@

- (sizeof(s)@p - 1)
+ strlen(s)


$ spatch -sp_file test.cocci test.c
init_defs_builtins: /usr/bin/../lib/coccinelle/standard.h
HANDLING: test.c
diff = 
--- test.c
+++ /tmp/cocci-output-3942-8f1e09-test.c
@@ -13,7 +13,7 @@ int
 main(void) {
 	const char *f = "foo";
 
-	foo("foo", sizeof("foo") - 1);
+	foo("foo", strlen("foo"));
 	foo("foo", sizeof("foo"));
 	foo(f, sizeof(f));
 
@@ -32,7 +32,7 @@ switch_case(size_t i) {
 			break;
 		case 2:
 		case sizeof("foo") - 1:
-			printf("%lu\n", sizeof("foo") - 1);
+			printf("%lu\n", strlen("foo"));
 			break;
 		case 5:
 			break;

@Girgias
Copy link
Member

Girgias commented Apr 11, 2022

The arg info files are generated files, however the script generating them is not updated.

@bwoebi
Copy link
Member

bwoebi commented Apr 12, 2022

Interestingly MSVC does not optimize strlen("str") to 3 in its -O0 config, while clang and gcc do. I assume that's acceptable on Windows maybe though?

@cmb69
Copy link
Member

cmb69 commented Apr 12, 2022

Interestingly MSVC does not optimize strlen("str") to 3 in its -O0 config, while clang and gcc do. I assume that's acceptable on Windows maybe though?

Yeah, I think this is acceptable; a lot of other important optimizations are not done with /O0. And we do not do any "official" debug builds for Windows anyway (neither CI nor elsewhere).

Comment on lines 7577 to +7578
form[l+sizeof("l")-2]=spec;
form[l+sizeof("l")-1]='\0';
form[l+strlen("l")]='\0';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either keep the sizeof here, or I would imagine using strlen for the line above, but this looks a bit weird and might be done for good reasons

Comment on lines -224 to +228
if (memcmp(cmd, "<:", sizeof("<:")-1) == SUCCESS) {
if (memcmp(cmd, "<:", strlen("<:")) == SUCCESS) {
state->in_code = 1;
return;
} else {
if (memcmp(cmd, ":>", sizeof(":>")-1) == SUCCESS) {
if (memcmp(cmd, ":>", strlen(":>")) == SUCCESS) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: but these usages of SUCCESS should probably be removed in favour of using 0 explicitly.

@derickr
Copy link
Member

derickr commented Apr 12, 2022

This seems like making changes just for the sake of making changes, so closing it.

@derickr derickr closed this Apr 12, 2022
@mvorisek mvorisek deleted the const_strlen branch April 13, 2022 09:31
@mvorisek
Copy link
Contributor Author

I refactored code to the prefered variant which is more readable and shorter. Is it a bad practise? I was planning to add Coccinelle script so the prefered variant is asserted by CI, so even merging issues are easily fixable. Can this PR be reopened?

@mvorisek
Copy link
Contributor Author

mvorisek commented Jun 9, 2023

regex replace used:

Tip for this type of replacement: Use Coccinelle.

@TimWolla do you think Coccinelle can be used to transform spaces indentation to tabs?

https://github.com/php/php-src/blob/master/.github/actions/verify-generated-files/action.yml#L15 can then be used for diff output

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants