Parse string containing dots in PHP

Asked by: gdm Asked: 3/21/2014 Last edited by: gdm Updated: 1/22/2022 Views: 3286

Q:

I want to parse the following string:

$str = 'ProceduresCustomer.tipi_id=10&ProceduresCustomer.id=1';                 
parse_str($str,$f);

I expect $f to be parsed as:

array(
    'ProceduresCustomer.tipi_id' => '10',
    'ProceduresCustomer.id' => '1'
)

In fact, parse_str returns:

array(
        'ProceduresCustomer_tipi_id' => '10',
        'ProceduresCustomer_id' => '1'
    )

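Apart from writing my own function, does anyone know whether there is a PHP function for this?

php parsing

Comments

0 votes zajd 3/21/2014
us1.php.net/explode, in your case: <?php $array = explode('&',$input); ?>
0 votes pbond 3/21/2014
Explode on the "=" sign, then use every odd index as a value and every even index as a key?
0 votes Sal00m 3/21/2014
Explode? $f = explode("&",$str);
0 votes echolocation 3/21/2014
What's wrong with parse_str()?
0 votes zajd 3/21/2014
parse_str() would work, but it isn't necessary

Answers: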

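13 votes Amal Murali 3/21/2014 #1

From the PHP manual:

Dots and spaces in variable names are converted to underscores. For example, <input name="a.b" /> becomes $_REQUEST["a_b"].

So, this is not possible. parse_str() converts all periods to underscores. If you really cannot avoid using periods in your query variable names, you will have to write a custom function to achieve this.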
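To make the manual's statement concrete, here is a small sketch of what parse_str() does to names and values. The query string is made up for illustration; only the names are rewritten, the values keep their dots.

```php
<?php
// Names containing a dot or a space are rewritten; values are left alone.
parse_str('a.b=x.y&c d=2&e_f=3', $result);

var_dump($result);
// Keys come back as a_b, c_d and e_f, while the value "x.y" keeps its dot.
```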

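The following function (taken from this answer) converts the name of each key-value pair in the query string to its hexadecimal form and then runs parse_str() on the result. Afterwards, the names are converted back to their original form. This way, the periods are never touched: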

function parse_qs($data)
{
    $data = preg_replace_callback('/(?:^|(?<=&))[^=[]+/', function($match) {
        return bin2hex(urldecode($match[0]));
    }, $data);

    parse_str($data, $values);

    return array_combine(array_map('hex2bin', array_keys($values)), $values);
}

Usage example:

$data = parse_qs($_SERVER['QUERY_STRING']);
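For readers who want to see why the hex trick works, the round trip can be walked through by hand for a single pair. This is only an illustrative sketch with a made-up input, not a replacement for parse_qs(); the strval() call is an extra precaution added here in case a key comes back as an integer.

```php
<?php
// Step-by-step version of the hex round trip for one name=value pair.
$name = 'ProceduresCustomer.id';

// 1. Hex-encode the name so it consists only of the characters 0-9 and a-f.
$hex = bin2hex($name);

// 2. parse_str() finds no dots or spaces to rewrite in the encoded name.
parse_str($hex . '=1', $values);

// 3. Decode the keys back to their original form.
$restored = array_combine(
    array_map('hex2bin', array_map('strval', array_keys($values))),
    $values
);

var_dump($restored);
```

The dot survives because parse_str() only ever sees its hex representation (2e), never the character itself.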

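Comments

3 votes Wesley Murch 3/21/2014
Nice one, Amal. Hopefully there aren't any brackets, e.g. &var[]=1&var[]=2
0 votes Amal Murali 3/21/2014
@giuseppe: Updated the answer to include an alternative solution.
3 votes The Blue Dog 3/21/2014 #2

Quick 'n' dirty.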

$str = "ProceduresCustomer.tipi_id=10&ProceduresCustomer.id=1";    

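function my_func($str){
    // Initialize the result so the function always returns an array
    $out = array();
    $expl = explode("&", $str);
    foreach($expl as $r){
        // Split on the first "=" only, so values may contain "="
        $tmp = explode("=", $r, 2);
        $out[$tmp[0]] = isset($tmp[1]) ? $tmp[1] : '';
    }
    return $out;
}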

var_dump(my_func($str));

array(2) {
    ["ProceduresCustomer.tipi_id"]=> string(2) "10"
    ["ProceduresCustomer.id"]=>string(1) "1"
}

Comments

0 votes sandwood 6/6/2023
Warning: this solution does not urldecode the query the way parse_str does. You may want to urldecode $tmp[0] and $tmp[1] before adding them to $out.
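A minimal sketch of what that comment suggests, assuming a flat query string without bracketed array keys. The function name parse_flat_query is hypothetical and chosen only for this example.

```php
<?php
// Hypothetical helper: a flat parser that URL-decodes names and values.
function parse_flat_query(string $query): array
{
    $out = [];
    foreach (explode('&', $query) as $pair) {
        if ($pair === '') {
            continue; // skip empty segments such as a trailing "&"
        }
        // Split on the first "=" only, so values may contain "=".
        $parts = explode('=', $pair, 2);
        $name  = urldecode($parts[0]);
        $value = urldecode($parts[1] ?? '');
        $out[$name] = $value;
    }
    return $out;
}

var_dump(parse_flat_query('ProceduresCustomer.tipi_id=10&note=a%20b%3Dc'));
```

Decoding happens after splitting, so an encoded "=" (%3D) or "&" (%26) inside a value is not mistaken for a separator.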
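1 vote svvac 3/21/2014 #3

This quickly written function tries to parse the query string properly and returns an array.

The second (optional) parameter, $break_dots, tells the parser to create a sub-array when it encounters a dot (this is beyond the scope of the question, but I included it anyway).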

/**
 * parse_name -- Parses a string and returns an array of the key path
 * if the string is malformed, only return the original string as a key
 *
 * $str The string to parse
 * $break_dot Whether or not to break on dots (default: false)
 *
 * Examples :
 *   + parse_name("var[hello][world]") = array("var", "hello", "world")
 *   + parse_name("var[hello[world]]") = array("var[hello[world]]") // Malformed
 *   + parse_name("var.hello.world", true) = array("var", "hello", "world")
 *   + parse_name("var.hello.world") = array("var.hello.world")
 *   + parse_name("var[hello][world") = array("var[hello][world") // Malformed
 */
function parse_name ($str, $break_dot = false) {
    // Output array
    $out = array();
    // Name buffer
    $buf = '';
    // Array counter
    $acount = 0;
    // Whether or not was a closing bracket, in order to avoid empty indexes
    $lastbroke = false;

    // Loop on chars
    foreach (str_split($str) as $c) {
        switch ($c) {
            // Encountering '[' flushes the buffer to $out and increments the
            // array counter
            case '[':
                if ($acount == 0) {
                    if (!$lastbroke) $out[] = $buf;
                    $buf = "";
                    $acount++;
                    $lastbroke = false;
                // In this case, the name is malformed. Return it as-is
                } else return array($str);
                break;

            // Encountering ']' flushes the buffer to $out and decrements the
            // array counter
            case ']':
                if ($acount == 1) {
                    if (!$lastbroke) $out[] = $buf;
                    $buf = '';
                    $acount--;
                    $lastbroke = true;
                // In this case, the name is malformed. Return it as-is
                } else return array($str);
                break;

            // If $break_dot is set to true, flush the buffer to $out.
            // Otherwise, treat it as a normal char.
            case '.':
                if ($break_dot) {
                    if (!$lastbroke) $out[] = $buf;
                    $buf = '';
                    $lastbroke = false;
                    break;
                }

            // Add every other char to the buffer
            default:
                $buf .= $c;
                $lastbroke = false;
        }
    }

    // If the counter isn't back to 0 then the string is malformed. Return it as-is
    if ($acount > 0) return array($str);

    // Otherwise, flush the buffer to $out and return it.
    if (!$lastbroke) $out[] = $buf;
    return $out;
}

/**
 * decode_qstr -- Take a query string and decode it to an array
 *
 * $str The query string
 * $break_dots Whether or not to break field names on dots (default: false)
 */
function decode_qstr ($str, $break_dots = false) {
    $out = array();

    // '&' is the field separator 
    $a = explode('&', $str);

    // For each field=value pair:
    foreach ($a as $param) {
        // Break on the first equal sign.
        $param = explode('=', $param, 2);

        // Parse the field name
        $key = parse_name($param[0], $break_dots);

        // This piece of code creates the array structure according to the
        // decomposition given by parse_name()
        $array = &$out; // Reference to the last object. Starts to $out
        $append = false; // If an empty key is given, treat it like $array[] = 'value'

        foreach ($key as $k) {
            // If the current ref isn't an array, make it one
            if (!is_array($array)) $array = array();
            // If the current key is empty, break the loop and append to current ref
            if ($k === '') { // strict check: empty("0") is true, which would misread a key of "0"
                $append = true;
                break;
            }
            // If the key isn't set, set it :)
            if (!isset($array[$k])) $array[$k] = NULL;

            // In order to walk down the array, we need to first save the ref in
            // $array to $tmp
            $tmp = &$array;
            // Deletes the ref from $array
            unset($array);
            // Create a new ref to the next item
            $array =& $tmp[$k];
            // Delete the save
            unset($tmp);
        }

        // If instructed to append, do that
        if ($append) $array[] = $param[1];
        // Otherwise, just set the value
        else $array = $param[1];

        // Destroy the ref for good
        unset($array);
    }

    // Return the result
    return $out;
}

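I tried to handle multi-level keys correctly. The code is a bit clumsy, but it should work. I tried to comment the code; leave a comment if you have any questions.

Test cases: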

var_dump(decode_qstr("ProceduresCustomer.tipi_id=10&ProceduresCustomer.id=1"));
// array(2) {
//   ["ProceduresCustomer.tipi_id"]=>
//   string(2) "10"
//   ["ProceduresCustomer.id"]=>
//   string(1) "1"
// }


var_dump(decode_qstr("ProceduresCustomer.tipi_id=10&ProceduresCustomer.id=1", true));
// array(1) {
//   ["ProceduresCustomer"]=>
//   array(2) {
//     ["tipi_id"]=>
//     string(2) "10"
//     ["id"]=>
//     string(1) "1"
//   }
// }
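0 votes Roemer 1/22/2022 #4

I'd like to add my solution as well, because I had a hard time finding one that met all my needs and handled every case. I tested it quite thoroughly. It preserves dots and spaces as well as unmatched square brackets (all of which are normally changed to underscores), and it handles arrays in the input well. Tested on PHP 8.0.0 and 8.0.14.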

const periodPlaceholder = 'QQleQPunT';
const spacePlaceholder = 'QQleQSpaTIE';


function parse_str_clean($querystr): array {
    // Like parse_str(), but without converting spaces, dots etc. to underscores.
    $qquerystr = str_ireplace(['.','%2E','+',' ','%20'], [periodPlaceholder,periodPlaceholder,spacePlaceholder,spacePlaceholder,spacePlaceholder], $querystr);
    $arr = null ; parse_str($qquerystr, $arr);

    sanitizeArr($arr, $querystr);
    return $arr;
}


function sanitizeArr(&$arr, $querystr) {
    foreach($arr as $key=>$val) {
        // restore values to original
        if ( is_string($val)) {
            $newval = str_replace([periodPlaceholder,spacePlaceholder], ["."," "], $val);
            if ( $val != $newval) $arr[$key]=$newval;
        }
    }
    unset($val);
    foreach($arr as $key=>$val) {
        $newkey = str_replace([periodPlaceholder,spacePlaceholder], ["."," "], $key);
        
        if ( str_contains($newkey, '_') ) { 

            // A period, space, [ or ] may have been converted to _. Restore it using the query string.
            $regex = '/&('.str_replace('_', '[ \.\[\]]', preg_quote($newkey, '/')).')=/';
            $matches = null ;
            if ( preg_match_all($regex, "&".urldecode($querystr), $matches) ) {

                if ( count(array_unique($matches[1])) === 1 && $key != $matches[1][0] ) {
                    $newkey = $matches[1][0] ;
                }
            }
        }
        if ( $newkey != $key ) $arr = array_replace_key($arr,$key, $newkey);

        if ( is_array($val)) {
            sanitizeArr($arr[$newkey], $querystr);
        }
    }
}


function array_replace_key($array, $oldKey, $newKey): array {
    // preserves order of the array
    if( ! array_key_exists( $oldKey, $array ) )   return $array;
    $keys = array_keys( $array );
    $keys[ array_search( $oldKey, $keys ) ] = $newKey;
    return array_combine( $keys, $array );
}
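  • First, spaces and dots (both literal and URL-encoded) in the query string are replaced by placeholders before parsing, and this is undone afterwards in the array keys and values. This way we can use the normal parse_str.
  • Unmatched [ and ] are also replaced with underscores by parse_str, but these cannot be reliably swapped for placeholders. And we definitely don't want to replace matched []. So we leave [ and ] alone and let parse_str replace them with underscores. Then, for each _ in the resulting keys, we look in the original query string to see whether there was a [ or ] at that position, and restore it.
  • Known bug: the key "something]something" and the nearly identical "something[something" may get confused. The chance of that occurring is practically zero, so I left it.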
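
The placeholder idea from the first bullet can be sketched on its own. This is a reduced illustration, not the full code above: it handles only dots, only top-level keys and string values, and the names parse_str_keep_dots and DOT_PLACEHOLDER are made up for the example. It also assumes the placeholder text never occurs in real input.

```php
<?php
// Hypothetical placeholder; assumed never to appear in a real query string.
const DOT_PLACEHOLDER = 'QQdotQQ';

function parse_str_keep_dots(string $query): array
{
    // Hide literal and URL-encoded dots so parse_str() cannot rewrite them.
    $masked = str_ireplace(['.', '%2E'], DOT_PLACEHOLDER, $query);
    parse_str($masked, $parsed);

    $result = [];
    foreach ($parsed as $key => $value) {
        // Put the dots back into the key and, for plain strings, the value.
        $key = str_replace(DOT_PLACEHOLDER, '.', (string) $key);
        if (is_string($value)) {
            $value = str_replace(DOT_PLACEHOLDER, '.', $value);
        }
        $result[$key] = $value;
    }
    return $result;
}

var_dump(parse_str_keep_dots('a.b=1.5&c%2Ed=x'));
```

Nested arrays are returned as parse_str() produced them, with placeholders still inside; restoring those would need the kind of recursion that sanitizeArr performs.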

Test:

var_dump(parse_str_clean("code.1=printr%28hahaha&code 1=448044&test.mijn%5B%5D%5B2%5D=test%20Roemer&test%20mijn%5B=test%202e%20Roemer"));

correctly yields

array(4) {
  ["code.1"]=>
  string(13) "printr(hahaha"
  ["code 1"]=>
  string(6) "448044"
  ["test.mijn"]=>
  array(1) {
    [0]=>
    array(1) {
      [2]=>
      string(11) "test Roemer"
    }
  }
  ["test[mijn"]=>
  string(14) "test 2e Roemer"
}

whereas the original parse_str yields only the following for the same string:

array(2) {
  ["code_1"]=>
  string(6) "448044"
  ["test_mijn"]=>
  string(14) "test 2e Roemer"
}