Skip to content

Implement new array function array_column() #56

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 12 commits into from
Closed

Implement new array function array_column() #56

wants to merge 12 commits into from

Conversation

ramsey
Copy link
Member

@ramsey ramsey commented Apr 14, 2012

array_column() returns the values of the specified column from a multi-dimensional array.

I've actually had this patch sitting around for several years, and I decided to take a look, clean it up, and finally submit it. Inspired by database methods like PDOStatement::fetchColumn(), I wrote this to scratch an itch, since I've often had the need to take a recordset array and pull a single column of values from it, which normally involves looping over the array and creating a new array of the values for the column I want to pull. I decided to implement this in the PHP core to save that step.

You can take a look at the test included to see how it works. The second parameter is the key for the column to return, which can be a number for numeric indexes or a string for associative arrays. If it can't find the column specified, it simply returns an empty array.

break;
default:
php_error_docref(NULL TSRMLS_CC, E_WARNING, "The key should be either a string or an integer");
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe RETURN_FALSE ? Though then I'd move the array init after the check.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated this to RETURN_FALSE. I've also cleaned it up a bit by removing some of the other cases.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IS_OBJECT should be there too, calling convert_to_string if necessary (will deal with the existence of __toString or not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

convert_to_string_ex afair :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added IS_OBJECT.

@cataphract
Copy link
Contributor

I'd rather have a more general mechanism for extracting subsets of arrays, like in Mathematica:

In[6]:= {{1, 2}, {3, 4}, {5, 6, 7}}[[All, 2;;]]
Out[6]= {{2}, {4}, {6, 7}}

In[7]:= {{1, 2}, {3, 4}, {5, 6, 7}}[[All, 2]]
Out[7]= {2, 4, 6}

In[8]:= {{1, 2}, {3, 4}, {5, 6, 7}}[[{1,3}, 2]]
Out[8]= {2, 6}

but I guess this is better than nothing.

@ramsey
Copy link
Member Author

ramsey commented Apr 18, 2012

@cataphract, can you suggest how that might look like PHP? I don't think my implementation is too far off. With some slight modifications, it might be possible to achieve something similar to what you're asking.

* PHP-5.3:
  Fix bug 61746 Failing tests in ext/standard/tests/file/windows_links/*
  Fix bug 61720 ext\libxml\tests\bug61367-read.phpt fails
  Fix bug 61719 ext\soap\tests\bugs\bug31422.phpt fails
  Fix bug 61718 ext\ldap\tests\ldap_set_rebind_proc_error.phpt fails
  Fix bug 61717 ext\ldap\tests\ldap_sasl_bind_basic.phpt fails
  Fix bug 61716 tests\basic\021.phpt fails
  Fix bug 61683 tests\basic\bug20539.phpt fails
  Fix bug 61680 ext\zlib\tests\gzencode_variation1-win32.phpt fails
  Fix bug 61676 ext\tidy\tests\bug54682.phpt fails
  Fix bug 61743 tests in ext\standard\tests\file\windows_acls\* fail
@pierrejoye
Copy link
Contributor

@ramsey

btw, pls put a RFC together and send a notification mail to internals once you are ready, referring to this discussion too to avoid to have to discuss everything again from scratch :)

@cataphract
Copy link
Contributor

@ramsey You could for instance use null mean both All and to specify a range. This can be sorted out later. For instance, for my examples above:

array_part($foo, [null, [2, null]]);
array_part($foo, [null, 2]);
array_part($foo, [[1, 3], 2]);

@smalyshev
Copy link
Contributor

I think we shouldn't try to boil the ocean here. array_column is very useful on its own. If we will need super-method to do everything and make fresh coffee, we could add it later, but that IMHO should not hold adding array_column.

@dsp
Copy link
Member

dsp commented Jun 6, 2012

Yes I agree. I think array_column is good enough to be added. I think we are still waiting for an RFC.

ramsey added 2 commits June 21, 2012 12:01
* PHP-5.3: (160 commits)
  Remove outdated and user-specific files
  Add NEWS for bug #62262
  Fixed bug RecursiveArrayIterator does not implement Countable
  sync zip ext version with pecl
  one more correction for COM upgrading notes
  split gzgetc_basic.phpt for zlib 1.2.7
  com ext upgrading correction
  com ext upgrading info
  Reverted the BC fix regarding to #57905, test adopted
  Merge PHP 5.3.14 NEWS
  re-add 61755 to NEWS
  Make travis silent
  Adding a test for ext/posix/tests/posix_getegid_basic.phpt
  typo
  improve overflow checks
  fix potential overflow in _php_stream_scandir
  set current versions for libzip and zip ext
  updated NEWS
  zip windows fixes
  fixed bc break related to #57905
  ...
@ramsey
Copy link
Member Author

ramsey commented Jun 21, 2012

@ccampbell
Copy link

Hey @ramsey, I think think this proposal is a great idea (I do stuff like this all the time), but how do you feel about extending it slightly.

For example if you start with the following array:

$data = array(
    array('id' => 1, 'name' => 'bob', 'valid' => 1, 'email' => '[email protected]'),
    array('id' => 2, 'name' => 'joe', 'valid' => 1, 'email' => '[email protected]'),
    array('id' => 3, 'name' => 'allison', 'valid' => 1, 'email' => '[email protected]')
);

Then you do array_column($data, 'name'); it would return

array(
    'bob',
    'joe',
    'allison
)

But what if you could do array_column($data, 'id', 'name', 'email') or array_column($data, ['id', 'name', 'email']) and have that return

array(
    'id' => array(
        1,
        2,
        3
    ),
    'name' => array(
        'bob',
        'joe',
        'allison'
    ),
    'email' => array(
        '[email protected]',
        '[email protected]',
        '[email protected]'
    )
)

This allows you to basically achieve the same thing but instead of having to make multiple calls you could get a bunch of arrays back in one iteration.

Not sure if this would really hurt performance in which case maybe it could be a separate method called array_columns, but I think it could be useful.

We actually do this at Vimeo quite a bit. Functions like array_multisort require the data in this format.

@Abeja
Copy link

Abeja commented Jul 13, 2012

Hello @ramsey, in my work I often use a similar function, perhaps it will be interesting to you:

function array_collect( $Items , $IndexKey , $ValueKey ) {
    if ( is_null( $IndexKey ) && is_null( $ValueKey ) || !count( $Items ) ) return $Items;
    $Result = array();
    if ( !is_null( $IndexKey ) ) {
        $Keys = is_array( $IndexKey ) ? array_values( $IndexKey ) : array( $IndexKey );
        $Key = ( is_null( end( $Keys ) ) && !array_pop( $Keys ) ) ? '[]' : '';
        $Code = '$Result[ $Item[ $Keys[ ' . implode( '] ] ][ $Item[ $Keys[ ' , array_keys( $Keys ) ) . '] ] ]' . $Key . ' = is_null( $ValueKey ) ? $Item : $Item[ $ValueKey ];';
        foreach ( $Items as $Item ) eval( $Code );
    } else
        foreach ( $Items as $Item ) 
            $Result[] = is_null( $ValueKey ) ? $Item : $Item[ $ValueKey ]; 
    return $Result;                             
}

Examples:

$records = array(
    array( 'id' => 1 , 'parent-id' => 7 , 'name' => 'Doe'),
    array( 'id' => 2 , 'parent-id' => 3 , 'name' => 'Smith'),
    array( 'id' => 4 , 'parent-id' => 7 , 'name' => 'Jones'),
);

array_collect( $records , array( 'parent-id' , 'id' ) , 'name' )
// => array( 7 => array( 1 => 'Doe' , 4 => 'Jones' ), 3 => array( 2 => 'Smith' ) )

array_collect( $records , array( 'parent-id' , null ) , 'name' )
// => array( 7 => array( 'Doe' , 'Jones' ), 3 => array ('Smith' ) )

array_collect( $records , null , 'name' )
// => array( 'Doe' , 'Smith' , 'Jones' )

array_collect( $records , 'id' , null )
/* => 
    array(
        1 => array( 'id' => 1 , 'parent-id' => 7 , 'name' => 'Doe'),
        2 => array( 'id' => 2 , 'parent-id' => 3 , 'name' => 'Smith'),
        4 => array( 'id' => 4 , 'parent-id' => 7 , 'name' => 'Jones'),
    );
*/

@pierrejoye
Copy link
Contributor

What's the status of this PR? Are we ready to vote on the RFC?

@ramsey
Copy link
Member Author

ramsey commented Sep 5, 2012

I need to make some changes, based on feedback.

@lstrojny
Copy link
Contributor

lstrojny commented Jan 6, 2013

@ramsey could you call a vote on internals on this function?

@ramsey
Copy link
Member Author

ramsey commented Jan 11, 2013

Closing this pull request. It is superseded by pull request #257.

@ramsey ramsey closed this Jan 11, 2013
@Majkl578
Copy link
Contributor

@ramsey: Next time, there is no need for new issue, just rebase your branch and force-push it, history will be updated here automatically.

salathe pushed a commit to salathe/php-src that referenced this pull request Mar 11, 2013
dsp added a commit that referenced this pull request Mar 20, 2013
* pull-request/257:
  array_column: Fix compile-time warnings
  array_column: Removed array_pluck() alias
  array_column: Set array_pluck as an alias for array_column
  array_column: Implement ability to specify an index column
  Cleaning up a memory leak.
  array_column: Adding test for IS_OBJECT and converting object to string
  array_column: Using add_next_index_zval() at nikic's recommendation.
  array_column: Improved tests
  array_column: Cleaning up, as recommended in pull request #56 comments
  Fixing typo in test for array_column()
  Simplify the code and use zend_hash_next_index_insert()
  Adding test for columns not present in all rows for array_column().
  Adding tests for the negative results of array_column()
  Implement new array function array_column()
dsp added a commit that referenced this pull request Mar 20, 2013
* PHP-5.5:
  NEWS for array_column
  array_column: Fix compile-time warnings
  array_column: Removed array_pluck() alias
  array_column: Set array_pluck as an alias for array_column
  array_column: Implement ability to specify an index column
  Cleaning up a memory leak.
  array_column: Adding test for IS_OBJECT and converting object to string
  array_column: Using add_next_index_zval() at nikic's recommendation.
  array_column: Improved tests
  array_column: Cleaning up, as recommended in pull request #56 comments
  Fixing typo in test for array_column()
  Simplify the code and use zend_hash_next_index_insert()
  Adding test for columns not present in all rows for array_column().
  Adding tests for the negative results of array_column()
  Implement new array function array_column()
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants