author | Julien Fiegehenn <simbabque@cpan.org> | 2022-06-02 22:00:55 +0100 |
---|---|---|
committer | Olaf Alders <olaf@wundersolutions.com> | 2022-06-06 15:41:04 -0400 |
commit | 6d3773e8f7fa10033c693ebf276ce8e70c478b48 | |
tree | 95e90777ee74aab3c522b84574fbd8b42b70658f | |
parent | 8dc77be4df6cbde2c91a51206d51fd611003452b |
various POD fixes
-rw-r--r-- | Changes | 2 |
-rw-r--r-- | lib/WWW/Mechanize.pm | 171 |
-rw-r--r-- | lib/WWW/Mechanize/FAQ.pod | 7 |
-rwxr-xr-x | script/mech-dump | 2 |
4 files changed, 99 insertions, 83 deletions
@@ -1,6 +1,8 @@ Revision history for WWW::Mechanize {{$NEXT}} + [DOCUMENTATION] + - Various POD fixes (Julien Fiegehenn) 2.08 2022-05-30 17:32:33Z [TESTS] diff --git a/lib/WWW/Mechanize.pm b/lib/WWW/Mechanize.pm index b0eb0a9..b7e02d6 100644 --- a/lib/WWW/Mechanize.pm +++ b/lib/WWW/Mechanize.pm @@ -90,7 +90,8 @@ traverse. $mech->back(); If you want finer control over your page fetching, you can use -these methods. C<follow_link> and C<submit_form> are just high +these methods. C<L<< follow_link()|/"$mech->follow_link(...)" >>> +and C<L<< submit_form()|/"$mech->submit_form( ... )" >>> are just high level wrappers around them. $mech->find_link( n => $number ); @@ -248,7 +249,7 @@ strict and verbose mode for form handling, which is done with L<HTML::Form>. Globally sets the HTML::Form strict flag which causes form submission to croak if any of the passed fields don't exist in the form, and/or a value doesn't exist in a select element. This can still be disabled in individual -calls to C<L<< submit_form()|"$mech->submit_form( ... )" >>>. +calls to C<L<< submit_form()|/"$mech->submit_form( ... )" >>>. Default is off. @@ -335,7 +336,7 @@ sub _agent { "WWW-Mechanize/$VERSION" } =head2 $mech->agent_alias( $alias ) Sets the user agent string to the expanded version from a table of actual user strings. -I<$alias> can be one of the following: +C<$alias> can be one of the following: =over 4 @@ -361,7 +362,9 @@ sets your User-Agent to Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1) -The list of valid aliases can be returned from C<known_agent_aliases()>. The current list is: +The list of valid aliases can be returned from +C<L<< known_agent_aliases()|/"$mech->known_agent_aliases()" >>>. +The current list is: =over @@ -403,9 +406,14 @@ sub agent_alias { } } -=head2 known_agent_aliases() +=head2 $mech->known_agent_aliases() Returns a list of all the agent aliases that Mech knows about. +This can also be called as a package or class method. 
+ + @aliases = WWW::Mechanize::known_agent_aliases(); + @aliases = WWW::Mechanize->known_agent_aliases(); + @aliases = $mech->known_agent_aliases(); =cut @@ -419,7 +427,7 @@ sub known_agent_aliases { =head2 $mech->get( $uri ) Given a URL/URI, fetches it. Returns an L<HTTP::Response> object. -I<$uri> can be a well-formed URL string, a L<URI> object, or a +C<$uri> can be a well-formed URL string, a L<URI> object, or a L<WWW::Mechanize::Link> object. The results are stored internally in the agent object, but you don't @@ -476,8 +484,8 @@ sub get { =head2 $mech->post( $uri, content => $content ) -POSTs I<$content> to I<$uri>. Returns an L<HTTP::Response> object. -I<$uri> can be a well-formed URI string, a L<URI> object, or a +POSTs C<$content> to C<$uri>. Returns an L<HTTP::Response> object. +C<$uri> can be a well-formed URI string, a L<URI> object, or a L<WWW::Mechanize::Link> object. =cut @@ -499,8 +507,8 @@ sub post { =head2 $mech->put( $uri, content => $content ) -PUTs I<$content> to I<$uri>. Returns an L<HTTP::Response> object. -I<$uri> can be a well-formed URI string, a L<URI> object, or a +PUTs C<$content> to C<$uri>. Returns an L<HTTP::Response> object. +C<$uri> can be a well-formed URI string, a L<URI> object, or a L<WWW::Mechanize::Link> object. my $res = $mech->head( $uri ); @@ -534,8 +542,8 @@ sub _SUPER_put { =head2 $mech->head ($uri ) -Performs a HEAD request to I<$uri>. Returns an L<HTTP::Response> object. -I<$uri> can be a well-formed URI string, a L<URI> object, or a +Performs a HEAD request to C<$uri>. Returns an L<HTTP::Response> object. +C<$uri> can be a well-formed URI string, a L<URI> object, or a L<WWW::Mechanize::Link> object. =cut @@ -653,7 +661,7 @@ sub history_count { This returns the I<n>th item in history. The 0th item is the most recent request and response, which would be acted on by methods like -C<L<< find_link()|"$mech->find_link( ... )" >>>. +C<L<< find_link()|/"$mech->find_link( ... )" >>>. 
The 1st item is the state you'd return to if you called C<L<< back()|/$mech->back() >>>. @@ -709,7 +717,7 @@ to the URI itself. Return the current response as an L<HTTP::Response> object. -Synonym for C<< $mech->response() >> +Synonym for C<< $mech->response() >>. =head2 $mech->status() @@ -810,17 +818,17 @@ sub redirects { Returns the content that the mech uses internally for the last page fetched. Ordinarily this is the same as C<< $mech->response()->decoded_content() >>, -but this may differ for HTML documents if L<< update_html|/$mech->update_html( $html ) >> is +but this may differ for HTML documents if C<L<< update_html|/"$mech->update_html( $html )" >>> is overloaded (in which case the value passed to the base-class implementation of same will be returned), and/or extra named arguments -are passed to I<content()>: +are passed to C<content()>: =over 2 =item I<< $mech->content( format => 'text' ) >> Returns a text-only version of the page, with all HTML markup -stripped. This feature requires I<HTML::TreeBuilder> version 5 or higher +stripped. This feature requires L<HTML::TreeBuilder> version 5 or higher to be installed, or a fatal error will be thrown. This works only if the contents are HTML. @@ -828,7 +836,7 @@ the contents are HTML. Returns the HTML document, modified to contain a C<< <base href="$base_href"> >> mark-up in the header. -I<$base_href> is C<< $mech->base() >> if not specified. This is +C<$base_href> is C<< $mech->base() >> if not specified. This is handy to pass the HTML to e.g. L<HTML::Display>. This works only if the contents are HTML. @@ -955,7 +963,7 @@ sub links { =head2 $mech->follow_link(...) Follows a specified link on the page. You specify the match to be -found using the same params that C<L<< find_link()|"$mech->find_link( ... )" >>> uses. +found using the same params that C<L<< find_link()|/"$mech->find_link( ... )" >>> uses. Here some examples: @@ -1021,12 +1029,13 @@ sub follow_link { Finds a link in the currently fetched page. 
It returns a L<WWW::Mechanize::Link> object which describes the link. (You'll -probably be most interested in the C<url()> property.) If it fails -to find a link it returns undef. +probably be most interested in the +C<L<< url()|"WWW::Mechanize::Link/$link->url()" >>> property.) +If it fails to find a link it returns C<undef>. You can take the URL part and pass it to the C<get()> method. If -that's your plan, you might as well use the C<follow_link()> method -directly, since it does the C<get()> for you automatically. +that's your plan, you might as well use the C<L<< follow_link()|/"$mech->follow_link(...)" >>> +method directly, since it does the C<get()> for you automatically. Note that C<< <FRAME SRC="..."> >> tags are parsed out of the HTML and treated as links so this method works with them. @@ -1107,7 +1116,8 @@ text of "News" and with "cnn.com" in the URL, use: $mech->find_link( text => 'News', url_regex => qr/cnn\.com/ ); The return value is a reference to an array containing a -L<WWW::Mechanize::Link> object for every link in C<< $self->content >>. +L<WWW::Mechanize::Link> object for every link in +C<L<< $self->content|/"$mech->content(...)" >>>. The links come from the following: @@ -1240,7 +1250,7 @@ sub _clean_keys { Returns all the links on the current page that match the criteria. The method for specifying link criteria is the same as in -C<L<< find_link()|"$mech->find_link( ... )" >>>. +C<L<< find_link()|/"$mech->find_link( ... )" >>>. Each of the links returned is a L<WWW::Mechanize::Link> object. In list context, C<find_all_links()> returns a list of the links. @@ -1258,7 +1268,7 @@ sub find_all_links { =head2 $mech->find_all_inputs( ... criteria ... ) -find_all_inputs() returns an array of all the input controls in the +C<find_all_inputs()> returns an array of all the input controls in the current form whose properties match all of the regexes passed in. The controls returned are all descended from HTML::Form::Input. 
See L<HTML::Form/INPUTS> for details. @@ -1312,7 +1322,8 @@ sub find_all_inputs { =head2 $mech->find_all_submits( ... criteria ... ) -C<find_all_submits()> does the same thing as C<find_all_inputs()> +C<find_all_submits()> does the same thing as +C<L<< find_all_inputs()|/"$mech->find_all_inputs( ... criteria ... )" >>> except that it only returns controls that are submit controls, ignoring other types of input controls like text and checkboxes. @@ -1442,7 +1453,8 @@ ALT text of "News" and with "cnn.com" in the URL, use: $mech->find_image( image => 'News', url_regex => qr/cnn\.com/ ); The return value is a reference to an array containing a -L<WWW::Mechanize::Image> object for every image in C<< $self->content >>. +L<WWW::Mechanize::Image> object for every image in +C<L<< $mech->content|/"$mech->content(...)" >>>. =cut @@ -1509,7 +1521,7 @@ sub _match_any_image_params { Returns all the images on the current page that match the criteria. The method for specifying image criteria is the same as in -C<L<< find_image()|"$mech->find_image()" >>>. +C<L<< find_image()|/"$mech->find_image()" >>>. Each of the images returned is a L<WWW::Mechanize::Image> object. In list context, C<find_all_images()> returns a list of the images. @@ -1560,14 +1572,14 @@ sub current_form { =head2 $mech->form_number($number) Selects the I<number>th form on the page as the target for subsequent -calls to C<L<< field()|"$mech->field( $name, $value, $number )" >>> -and C<L<< click()|"$mech->click( $button [, $x, $y] )" >>>. +calls to C<L<< field()|/"$mech->field( $name, $value, $number )" >>> +and C<L<< click()|/"$mech->click( $button [, $x, $y] )" >>>. Also returns the form that was selected. If it is found, the form is returned as an L<HTML::Form> object and set internally for later use with Mech's form methods such as -C<L<< field()|"$mech->field( $name, $value, $number )" >>> and -C<L<< click()|"$mech->click( $button [, $x, $y] )" >>>. 
+C<L<< field()|/"$mech->field( $name, $value, $number )" >>> and +C<L<< click()|/"$mech->click( $button [, $x, $y] )" >>>. When called in a list context, the number of the found form is also returned as a second value. @@ -1600,8 +1612,8 @@ generated. If it is found, the form is returned as an L<HTML::Form> object and set internally for later use with Mech's form methods such as -C<L<< field()|"$mech->field( $name, $value, $number )" >>> and -C<L<< click()|"$mech->click( $button [, $x, $y] )" >>>. +C<L<< field()|/"$mech->field( $name, $value, $number )" >>> and +C<L<< click()|/"$mech->click( $button [, $x, $y] )" >>>. Returns undef if no form is found. @@ -1619,8 +1631,8 @@ with that ID, then the first one is used, and a warning is generated. If it is found, the form is returned as an L<HTML::Form> object and set internally for later use with Mech's form methods such as -C<L<< field()|"$mech->field( $name, $value, $number )" >>> and -C<L<< click()|"$mech->click( $button [, $x, $y] )" >>>. +C<L<< field()|/"$mech->field( $name, $value, $number )" >>> and +C<L<< click()|/"$mech->click( $button [, $x, $y] )" >>>. If no form is found it returns C<undef>. This will also trigger a warning, unless C<quiet> is enabled. @@ -1665,8 +1677,8 @@ and a warning is generated. If it is found, the form is returned as an L<HTML::Form> object and set internally for later used with Mech's form methods such as -C<L<< field()|"$mech->field( $name, $value, $number )" >>> and -C<L<< click()|"$mech->click( $button [, $x, $y] )" >>>. +C<L<< field()|/"$mech->field( $name, $value, $number )" >>> and +C<L<< click()|/"$mech->click( $button [, $x, $y] )" >>>. Returns undef and emits a warning if no form is found. 
@@ -1727,8 +1739,8 @@ Using C<undef> as value means that the attribute in question must not be present If it is found, the form is returned as an L<HTML::Form> object and set internally for later used with Mech's form methods such as -C<L<< field()|"$mech->field( $name, $value, $number )" >>> and -C<L<< click()|"$mech->click( $button [, $x, $y] )" >>>. +C<L<< field()|/"$mech->field( $name, $value, $number )" >>> and +C<L<< click()|/"$mech->click( $button [, $x, $y] )" >>>. Returns undef if no form is found. @@ -1791,11 +1803,11 @@ These methods allow you to set the values of fields in a given form. Given the name of a field, set its value to the value specified. This applies to the current form (as set by the -C<L<< form_name()|"$mech->form_name( $name )" >>> or -C<L<< form_number()|"$mech->form_number($number)" >>> +C<L<< form_name()|/"$mech->form_name( $name )" >>> or +C<L<< form_number()|/"$mech->form_number($number)" >>> method or defaulting to the first form on the page). -The optional I<$number> parameter is used to distinguish between two fields +The optional C<$number> parameter is used to distinguish between two fields with the same name. The fields are numbered from 1. =cut @@ -2073,7 +2085,7 @@ sub untick { Given the name of a field, return its value. This applies to the current form. -The optional I<$number> parameter is used to distinguish between two fields +The optional C<$number> parameter is used to distinguish between two fields with the same name. The fields are numbered from 1. If the field is of type file (file upload field), the value is always @@ -2146,7 +2158,7 @@ Clicks the button with the value I<value> in the current form. =item * C<< input => $inputobject >> -Clicks on the button referenced by $inputobject, an instance of +Clicks on the button referenced by C<$inputobject>, an instance of L<HTML::Form::SubmitInput> obtained e.g. 
from $mech->current_form()->find_input( undef, 'submit' ) @@ -2264,8 +2276,8 @@ and data setting in one operation. It selects the first form that contains all fields mentioned in C<\%fields>. This is nice because you don't need to know the name or number of the form to do this. -(calls C<L<< form_with_fields()|"$mech->form_with_fields( @fields )" >>> and - C<L<< set_fields()|"$mech->set_fields( $name => $value ... )" >>>). +(calls C<L<< form_with_fields()|/"$mech->form_with_fields( @fields )" >>> and + C<L<< set_fields()|/"$mech->set_fields( $name => $value ... )" >>>). If you choose C<with_fields>, the C<fields> option will be ignored. The C<form_number>, C<form_name> and C<form_id> options will still be used. An @@ -2275,26 +2287,26 @@ criteria. =item * C<< form_number => n >> Selects the I<n>th form (calls -C<L<< form_number()|"$mech->form_number($number)" >>>. If this param is not +C<L<< form_number()|/"$mech->form_number($number)" >>>. If this param is not specified, the currently-selected form is used. =item * C<< form_name => name >> Selects the form named I<name> (calls -C<L<< form_name()|"$mech->form_name( $name )" >>>) +C<L<< form_name()|/"$mech->form_name( $name )" >>>) =item * C<< form_id => ID >> Selects the form with ID I<ID> (calls -C<L<< form_id()|"$mech->form_id( $name )" >>>) +C<L<< form_id()|/"$mech->form_id( $id )" >>>) =item * C<< button => button >> -Clicks on button I<button> (calls C<L<< click()|"$mech->click( $button [, $x, $y] )" >>>) +Clicks on button I<button> (calls C<L<< click()|/"$mech->click( $button [, $x, $y] )" >>>) =item * C<< x => x, y => y >> -Sets the x or y values for C<L<< click()|"$mech->click( $button [, $x, $y] )" >>> +Sets the x or y values for C<L<< click()|/"$mech->click( $button [, $x, $y] )" >>> =item * C<< strict_forms => bool >> @@ -2310,7 +2322,7 @@ by passing C<< strict_forms => 0 >> here. If no form is selected, the first form found is used. 
-If I<button> is not passed, then the C<L<< submit()|"$mech->submit()" >>> +If I<button> is not passed, then the C<L<< submit()|/"$mech->submit()" >>> method is used instead. If you want to submit a file and get its content from a scalar rather @@ -2503,7 +2515,7 @@ sub quiet { Get or set the page stack depth. Use this if you're doing a lot of page scraping and running out of memory. -A value of 0 means "no history at all." By default, the max stack depth +A value of C<0> means "no history at all." By default, the max stack depth is humongously large, effectively keeping all history. =cut @@ -2516,10 +2528,10 @@ sub stack_depth { =head2 $mech->save_content( $filename, %opts ) -Dumps the contents of C<< $mech->content >> into I<$filename>. -I<$filename> will be overwritten. Dies if there are any errors. +Dumps the contents of C<L<< $mech->content|/"$mech->content(...)" >>> into C<$filename>. +C<$filename> will be overwritten. Dies if there are any errors. -If the content type does not begin with "text/", then the content +If the content type does not begin with C<"text/">, then the content is saved in binary mode (i.e. C<binmode()> is set on the output filehandle). @@ -2539,7 +2551,7 @@ This I<should> be the safest way to save contents verbatim. =item I<< $mech->save_content( $filename, binmode => $binmode ) >> -Filehandle is set to binary mode. If C<$binmode> begins with ':', it is +Filehandle is set to binary mode. If C<$binmode> begins with C<':'>, it is passed as a parameter to C<binmode>: binmode $fh, $binmode; @@ -2550,7 +2562,7 @@ otherwise the filehandle is set to binary mode if C<$binmode> is true: =item I<all other arguments> -are passed as-is to C<< $mech->content(%opts) >>. In particular, +are passed as-is to C<L<< $mech->content(%opts)|/"$mech->content(...)" >>>. In particular, C<decoded_by_headers> might come handy if you want to revert the effect of line compression performed by the web server but without further interpreting the contents (e.g. 
decoding it according to the charset). @@ -2587,10 +2599,10 @@ sub save_content { =head2 $mech->dump_headers( [$fh] ) Prints a dump of the HTTP response headers for the most recent -response. If I<$fh> is not specified or is undef, it dumps to +response. If C<$fh> is not specified or is C<undef>, it dumps to STDOUT. -Unlike the rest of the dump_* methods, $fh can be a scalar. It +Unlike the rest of the C<dump_*> methods, C<$fh> can be a scalar. It will be used as a file name. =cut @@ -2620,10 +2632,10 @@ sub dump_headers { =head2 $mech->dump_links( [[$fh], $absolute] ) -Prints a dump of the links on the current page to I<$fh>. If I<$fh> -is not specified or is undef, it dumps to STDOUT. +Prints a dump of the links on the current page to C<$fh>. If C<$fh> +is not specified or is C<undef>, it dumps to STDOUT. -If I<$absolute> is true, links displayed are absolute, not relative. +If C<$absolute> is true, links displayed are absolute, not relative. =cut @@ -2642,13 +2654,13 @@ sub dump_links { =head2 $mech->dump_images( [[$fh], $absolute] ) -Prints a dump of the images on the current page to I<$fh>. If I<$fh> -is not specified or is undef, it dumps to STDOUT. +Prints a dump of the images on the current page to C<$fh>. If C<$fh> +is not specified or is C<undef>, it dumps to STDOUT. -If I<$absolute> is true, links displayed are absolute, not relative. +If C<$absolute> is true, links displayed are absolute, not relative. The output will include empty lines for images that have no C<src> attribute -and therefore no C<<->url>>. +and therefore no URL. =cut @@ -2667,8 +2679,8 @@ sub dump_images { =head2 $mech->dump_forms( [$fh] ) -Prints a dump of the forms on the current page to I<$fh>. If I<$fh> -is not specified or is undef, it dumps to STDOUT. Running the following: +Prints a dump of the forms on the current page to C<$fh>. If C<$fh> +is not specified or is C<undef>, it dumps to STDOUT. 
Running the following: my $mech = WWW::Mechanize->new(); $mech->get("https://www.google.com/"); @@ -2701,8 +2713,8 @@ sub dump_forms { =head2 $mech->dump_text( [$fh] ) -Prints a dump of the text on the current page to I<$fh>. If I<$fh> -is not specified or is undef, it dumps to STDOUT. +Prints a dump of the text on the current page to C<$fh>. If C<$fh> +is not specified or is C<undef>, it dumps to STDOUT. =cut @@ -2805,7 +2817,7 @@ update it so the links come out correctly: This method is also used internally by the mech itself to update its own HTML content when loading a page. This means that if you would like to I<systematically> perform the above HTML substitution, you -would overload I<update_html> in a subclass thusly: +would overload C<update_html> in a subclass thusly: package MyMech; use base 'WWW::Mechanize'; @@ -2818,11 +2830,12 @@ would overload I<update_html> in a subclass thusly: If you do this, then the mech will use the tidied-up HTML instead of the original both when parsing for its own needs, and for returning to -you through C<L<< content()|"$mech->content(...)" >>>. +you through C<L<< content()|/"$mech->content(...)" >>>. Overloading this method is also the recommended way of implementing extra validation steps (e.g. link checkers) for every HTML page -received. L</warn> and L</die> would then come in handy to signal +received. C<L<< warn|/"warn( @messages )" >>> and +C<L<< warn|/"warn( @messages )" >>> would then come in handy to signal validation errors. =cut @@ -2928,10 +2941,10 @@ know about them. =head2 $mech->_update_page($request, $response) -Updates all internal variables in $mech as if $request was just +Updates all internal variables in C<$mech> as if C<$request> was just performed, and returns $response. The page stack is B<not> altered by this method, it is up to caller (e.g. -C<L<< request|"$mech->request( $request [, $arg [, $size]])" >>>) +C<L<< request|/"$mech->request( $request [, $arg [, $size]])" >>>) to do that. 
=cut @@ -3567,7 +3580,7 @@ listings. Randal Schwartz's article on scraping Yahoo News for images. It's already out of date: He manually walks the list of links hunting for matches, which wouldn't have been necessary if the -C<L<< find_link()|"$mech->find_link( ... )" >>> method existed at press time. +C<L<< find_link()|/"$mech->find_link( ... )" >>> method existed at press time. =item * L<http://www.perladvent.org/2002/16th/> diff --git a/lib/WWW/Mechanize/FAQ.pod b/lib/WWW/Mechanize/FAQ.pod index b799971..7985d6e 100644 --- a/lib/WWW/Mechanize/FAQ.pod +++ b/lib/WWW/Mechanize/FAQ.pod @@ -85,7 +85,8 @@ can be used to add the headers that you need. =head2 Which modules work like Mechanize and have JavaScript support? In no particular order: L<Gtk2::WebKit::Mechanize>, L<Win32::IE::Mechanize>, -L<WWW::Mechanize::Firefox>, L<WWW::Scripter>, L<WWW::Selenium> +L<WWW::Mechanize::Firefox>, L<WWW::Mechanize::Chrome>, L<WWW::Scripter>, +L<WWW::Selenium> =head1 How do I do X? @@ -164,8 +165,8 @@ them with =head2 How do I get a list of HTTP headers and their values? All L<HTTP::Headers> methods work on a L<HTTP::Response> object which is -returned by the I<get()>, I<reload()>, I<response()/res()>, I<click()>, -I<submit_form()>, and I<request()> methods. +returned by the C<get()>, C<reload()>, C<response()/res()>, C<click()>, +C<submit_form()>, and C<request()> methods. my $mech = WWW::Mechanize->new( autocheck => 1 ); $mech->get( 'http://my.site.com' ); diff --git a/script/mech-dump b/script/mech-dump index 0771efe..c36832c 100755 --- a/script/mech-dump +++ b/script/mech-dump @@ -73,7 +73,7 @@ The order of the options specified is relevant. Repeated options get repeated dumps. Proxy settings are specified through the environment (e.g. C<http_proxy=http://proxy.my.place/>). -See LWP::UserAgent for details. +See L<LWP::UserAgent> for details. =cut |