PHP Text Highlighter

This snippet accepts an array of words, and a descriptive name which can be used to reference the words, like so:


$this->scan(array('mysql','css','js','content management system',
  'xhtml','xml','rss','html','lamp','ria',
  'sugar','php','zend framework','oo','dojo',
  'doctrine','javascript','smarty','ez publish'),'skills');

It scans a block of text, and returns matches enclosed in span tags, which can be highlighted using CSS.


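private function scan($aWords,$sType)
{
  // Thanks to: http://us.php.net/manual/en/function.array-walk.php#103056
  // Turn each word into a pattern that requires a non-word character on both sides.
  // A closure is used because create_function() was removed in PHP 8.0;
  // preg_quote() keeps any special characters in a word from being read as regex syntax.
  array_walk($aWords, function (&$val) {
    $val = '/(\W)(' . preg_quote($val, '/') . ')(\W)/umi';
  });
  // Highlight the first match of each word; the number of replacements is stored in <type>_COUNT
  $this->data[$sType.'_HTML']=preg_replace($aWords,'\1<span class="found">\2</span>\3',
    $this->data['sContent'],1,$this->data[$sType.'_COUNT']);
}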

// Used to access the private $data array 
public function __get($sName)
{
  if(isset($this->data[$sName]))
    return $this->data[$sName];
  else
    return null;
}

To extract the HTML and a count of the matches:


$sOutput=<<<BLOCK
<div id="skills_check">
$oResponse->skills_HTML
<div id="skills_count">
  Skills matched: $oResponse->skills_COUNT
</div>
</div>
BLOCK;

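A match is a listed word with a non-word character immediately before and after it, which makes mismatches less likely. For example, ‘oo’ would not match ‘Joomla’, but it would match ‘ oo.’, or ‘,oo and …’. Because a character is required on both sides, a word at the very start or end of the text is not matched.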
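The matching rule can be checked in isolation. This is a minimal sketch, not part of the original class: the helper name isWholeWordMatch and the sample strings are made up for illustration, and the pattern is the one scan() is meant to build.

```php
<?php
// Hypothetical helper for illustration; it builds the same kind of pattern as scan().
function isWholeWordMatch($sWord, $sText)
{
    $sPattern = '/(\W)(' . preg_quote($sWord, '/') . ')(\W)/umi';
    return preg_match($sPattern, $sText) === 1;
}

var_dump(isWholeWordMatch('oo', 'Built with Joomla.'));   // bool(false): letters on both sides
var_dump(isWholeWordMatch('oo', 'I know oo. And more.')); // bool(true): space before, period after
var_dump(isWholeWordMatch('oo', 'php,oo and xml'));       // bool(true): comma before, space after
var_dump(isWholeWordMatch('php', 'php is first'));        // bool(false): nothing precedes the word
```

The last call shows the edge case: padding the content with a space on each side before scanning is one way to catch words at the very start or end.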

There can be several scans in a page. Using the id of a parent div allows each one to be presented differently, for example:

span.found
{
font-weight:bolder;
}
#skills_check span.found
{
color:#22ff22;
}
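To see the pieces of this post working together, here is a minimal end-to-end sketch. The class name TextHighlighter, its constructor, and the sample text are assumptions made for illustration; only scan(), __get() and the $data keys come from the post. scan() is public here so it can be called directly, and array_map() with a closure stands in for array_walk().

```php
<?php
// Hypothetical wrapper class; the post does not show the class these methods live in.
class TextHighlighter
{
    private $data = array();

    public function __construct($sContent)
    {
        $this->data['sContent'] = $sContent;
    }

    // Wraps the first match of each word in a span and stores the HTML and the match count.
    public function scan(array $aWords, $sType)
    {
        $aPatterns = array_map(function ($sWord) {
            return '/(\W)(' . preg_quote($sWord, '/') . ')(\W)/umi';
        }, $aWords);
        $this->data[$sType . '_HTML'] = preg_replace(
            $aPatterns,
            '\1<span class="found">\2</span>\3',
            $this->data['sContent'],
            1,
            $this->data[$sType . '_COUNT']
        );
    }

    // Read-only access to the private $data array
    public function __get($sName)
    {
        return isset($this->data[$sName]) ? $this->data[$sName] : null;
    }
}

// The sample text is padded with spaces so words at either end can match.
$oResponse = new TextHighlighter(' Ten years of PHP, MySQL and Joomla work. ');
$oResponse->scan(array('mysql', 'php', 'oo'), 'skills');
echo $oResponse->skills_HTML, "\n";
echo 'Skills matched: ', $oResponse->skills_COUNT, "\n"; // Skills matched: 2
```

The patterns are applied one after another to the same text, so a word list that contained terms such as ‘span’ or ‘class’ could also match the markup inserted by an earlier pattern.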

Zend Framework - No translation for the language 'en_US' available.

Notice: No translation for the language ‘en_US’ available. in /var/www/html/ZendFramework-1.10.8-minimal/library/Zend/Translate/Adapter.php on line 441

I’ve been working on an application using Zend Framework as the foundation. One of the key elements is ensuring internationalization/i18n support, and the error above was being displayed.

In addition to translating page text, I wanted to add htmlentities conversion, without calling it explicitly.

I created an adapter which extends the Gettext adapter.


<?php

class CLOUD_Translate_Adapter_Gettext extends Zend_Translate_Adapter_Gettext
{
    public function translate($messageId, $locale = null)
    {
        return htmlentities(parent::translate($messageId, $locale), ENT_QUOTES, 'UTF-8');
    }
}
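To show what the adapter does to a translated string, here is a small sketch that applies the same htmlentities() call outside Zend Framework. The function name escapeTranslated and the sample string are made up for illustration.

```php
<?php
// Hypothetical stand-in for the adapter's translate(): the same htmlentities() call
// applied to a string that is assumed to be an already translated, UTF-8 encoded message.
function escapeTranslated($sMessage)
{
    return htmlentities($sMessage, ENT_QUOTES, 'UTF-8');
}

echo escapeTranslated('Café "menu" & <b>it\'s</b>'), "\n";
// Caf&eacute; &quot;menu&quot; &amp; &lt;b&gt;it&#039;s&lt;/b&gt;
```

Any markup inside a translation string is escaped as well, so translations that are meant to contain HTML would need a different approach.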

To prevent the error from being displayed, I changed:

        
$translate = new Zend_Translate(array('adapter'=>'CLOUD_Translate_Adapter_Gettext',
                'content'=>$language_path,
                'scan' => Zend_Translate::LOCALE_DIRECTORY,
                'locale'=>$locale->toString()));
$translate->setOptions(array('disableNotices'=>true));

to:


$translate = new Zend_Translate(array('adapter'=>'CLOUD_Translate_Adapter_Gettext',
                'content'=>$language_path,
                'scan' => Zend_Translate::LOCALE_DIRECTORY,
                'locale'=>$locale->toString(),
                'disableNotices'=>true));

The difference is that disableNotices is included in the instantiation, so that as it initializes the translation object, detected errors are not reported.

Since the default language for the page is en-US, that text needs no translation, so suppressing the notice is safe.

Key Cloud Service Architecture Considerations

Converting an existing product or service for use in the cloud should begin with a consideration of several key requirements. Simply extending a system to support more clients may result in serious security, performance, and scalability issues.

These core requirements should be addressed before the application-specific features, because they represent the foundation for the system. These common elements apply to virtually any cloud-based service.

  • Secure - The system must be designed and developed with best practices for security.
  • Scalable - The system must be scalable, to allow additional organizations to join without impacting others. The system capacity must be able to grow as necessary.
  • Manageable - It must be possible to adjust the resource allocation in response to changing consumption. If one server is too busy, clients should be able to be moved to a different server, with minimal service impact.
  • Fault Tolerant - System outages should be minimized and a disaster recovery plan must be in place. This includes backup and restore procedures, whether the failure or problem is due to a hardware, software, administrator, or user issue.
  • Segmented - Each client or organization must have a separate file system area, or account, and database, and access to these resources should be limited as necessary for the use of the data. Failure to organize and protect the data can make it difficult or impossible to secure, scale and manage. Although some service elements may be system or server based, they must be provisioned or partitioned for use in such a way that the data is segmented.
  • Access Paths - Cloud services should allow subscribers to white label, so they can offer the services as if they are their own. Access URLs should be defined by the subscribers, probably as subdomains or derivative domains from their existing architecture.
  • Brandable - It is important that the services be brandable, so clients can identify the system as theirs. The extent of branding can vary, but at a minimum, a corporate logo should be displayed, and some connections or links to the corporate site and support resources should be provided. A neutral color scheme, such as grayscale, can make it easier to ensure a nice presentation regardless of the logo.
  • Incremental - A tiered service offering should be established to allow free trials, as well as service levels so clients can purchase only the services they need.
  • Extensible - Cloud offerings should include the ability to extend and integrate with other systems, and those extensions should be managed at the client level. Thus, if one subscriber needs custom code, the code can be added without affecting other clients.
  • Integration - As with extensions, custom integration should be enabled to allow access into client infrastructures without affecting other clients.
  • Turnkey - The base system should be available very quickly, and setup time should be minimal.
  • Easy to Use - Examine any existing systems and consider support requests to identify areas that have been difficult for clients to work with. Be sure to consider ways to improve these issues before proceeding.
  • Web Service or API Access - Subscribers should be able to interact with the system through automated processes with standards based interfaces such as SOAP and REST. These interfaces must be well-defined and documented. Test code and interfaces should be provided to ease development.
  • Well Supported - An extension of easy to use, support means ensuring users can use the system to do what needs to be done in a timely manner. The application needs to be organized carefully and supported with validation and help on the client (browser) side, adequate documentation (a full manual available as PDF, HTML, or a zipped download), a ticket system, and possibly a live chat interface.
  • Cost Effective - The cost of the system must be competitive.
  • Service Level Agreement - A service level agreement must be provided. This ensures both the provider and client understand exactly what is being offered.
  • Localization (l10n)
  • Internationalization (i18n)

Word 2007 VBA Macro to Create PDFs for a Manual

This is a macro that reads a (Microsoft) Word 2007 document which uses INCLUDETEXT fields to draw in content from other documents and exports the content as a PDF, including a table of contents and index.

There are three administrator types or roles - System, Partner, and Limited. For each chapter, an INCLUDETEXT field includes the content for that role, for example:

{INCLUDETEXT "{BaseDir}//Overview//{AdminRole}.docx"}

The document for each role has the appropriate content. In most cases, the directory has a Common.docx file that has the text for all roles, and those authorized to view it include the file like so:

{INCLUDETEXT "{BaseDir}//Overview//Common.docx"}

Unauthorized users would have an empty .docx file, or one with a limited version of content.

The macro first extracts the value of BaseDir. Keeping the absolute path in this one field means it can be changed without updating all the files. Relative paths just didn’t work well.

The document also uses some bookmarks, which the macro uses to reference different sections. One bookmark includes all the content from both the chapters and appendices. The chapters are enclosed in a bookmark called ‘Chapter’, and the appendices are enclosed in a bookmark called ‘Appendix’. These bookmarks are used to create the footers, since the chapters are numeric and the appendices are alphabetic. If content for the chapter is included for the role, a section break is inserted. This ensures that the chapter and page numbers are consecutive, even if chapters are omitted.



Option Base 1
Option Explicit
Global AdminTypes

Sub Init()
    AdminTypes = Array("System", "Partner", "Limited")
End Sub

Sub CreatePDFs()

Dim C, D, I, L, N, T, F, BaseDir, FieldCount
Dim BaseDirFieldItemIndex, AdminTypeFieldItemIndex, CurrentYearFieldItemIndex
Dim rngTemp As Range, rngField As Range
Dim fldPtr As Field
Dim J, JL, JU, arrBookmarks(2) As String
Dim Selection As Range
Dim Footer As Range
Dim S As String, LastFooterType As String

Init
I = 1
AdminTypeFieldItemIndex = 0
BaseDirFieldItemIndex = 0

FieldCount = ActiveDocument.Fields.Count()

For I = 1 To FieldCount
    T = ActiveDocument.Fields.Item(I).Code.Text
    If (InStr(1, T, "BaseDir", vbTextCompare) <> 0) Then
        BaseDirFieldItemIndex = I
    End If
    If (InStr(1, T, "AdminType", vbTextCompare) <> 0) Then
        AdminTypeFieldItemIndex = I
    End If
    If (InStr(1, T, "CurrentYear", vbTextCompare) <> 0) Then
        CurrentYearFieldItemIndex = I
    End If
    If AdminTypeFieldItemIndex <> 0 And BaseDirFieldItemIndex <> 0 And CurrentYearFieldItemIndex <> 0 Then Exit For
Next

' Set the current year for the copyright date
Set rngTemp = ActiveDocument.Fields.Item(CurrentYearFieldItemIndex).Code
rngTemp.Text = " SET CurrentYear " + Chr(34) + CStr(Year(Date)) + Chr(34) + " "

' Set the base directory for INCLUDETEXT tags
ActiveDocument.Fields.Item(BaseDirFieldItemIndex).Update
D = ActiveDocument.Fields.Item(BaseDirFieldItemIndex).Result()
Set rngTemp = ActiveDocument.Fields.Item(BaseDirFieldItemIndex).Code
rngTemp.Text = " SET BaseDir " + Chr(34) + D + Chr(34) + " "
D = D + Chr(92)

' Setup the loop boundaries for the roles
I = LBound(AdminTypes)
L = UBound(AdminTypes)

' Set up the sections that will be processed
arrBookmarks(1) = "Chapter"
arrBookmarks(2) = "Appendix"
JL = LBound(arrBookmarks)
JU = UBound(arrBookmarks)
LastFooterType = ""

' Loop through all the admin types or roles
For N = I To L

    ' Set the role for this document
    Set rngTemp = ActiveDocument.Fields.Item(AdminTypeFieldItemIndex).Code
    rngTemp.Text = " SET AdminType """ + AdminTypes(N) + """ "
    ActiveDocument.Fields.Item(AdminTypeFieldItemIndex).Update
         
    ' Update all the fields for this role
    Set rngTemp = ActiveDocument.Bookmarks("ChapterBlockStart").Range
    rngTemp.Select
    rngTemp.Fields.Update
    
    ' Loop through the sections that will be processed.
    ' Each included file is checked to see if content was included
    ' Files that have content are followed by a section break, empty files are not
    For J = JL To JU
         
        Set rngTemp = ActiveDocument.Bookmarks(arrBookmarks(J)).Range
        rngTemp.Select
     
        For Each fldPtr In rngTemp.Fields
            ' Loop through all the fields in this section or group of included files
            T = fldPtr.Type
            ' If this field is an INCLUDETEXT
            If (T = wdFieldIncludeText) Then
                T = Trim(fldPtr.Result())
                ' If the included text is not empty
                If (T <> "") And (Asc(T) <> 13) Then
                    Set rngField = ActiveDocument.Range
                    rngField.Find.Text = fldPtr.Code
                    ' Search the document for the tag.  This ensures included tags do not add section breaks
                    rngField.Find.Execute
                    If rngField.Find.Found Then
                        ' Select the tag
                        rngField.Select
                        ' Advance the range to the end of the field
                        rngField.MoveEnd wdCharacter, 2
                        rngField.Collapse wdCollapseEnd
                        ' Page numbering starts at 1 for all sections
                        rngField.Sections(1).Footers(wdHeaderFooterPrimary).PageNumbers.StartingNumber = 1
                        rngField.Sections(1).Footers(wdHeaderFooterPrimary).PageNumbers.RestartNumberingAtSection = True
                        If LastFooterType = arrBookmarks(J) Then
                            rngField.Sections(1).Footers(wdHeaderFooterPrimary).LinkToPrevious = True
                        Else
                            rngField.Sections(1).Footers(wdHeaderFooterPrimary).LinkToPrevious = False
                            ' Create new footer
                            Set Footer = rngField.Sections(1).Footers(wdHeaderFooterPrimary).Range
                            Footer.Select
                            ' Clear any existing text
                            Footer.Delete
                            ' Set up the table
                            Footer.Tables.Add Range:=Footer, NumRows:=1, _
                                NumColumns:=2, DefaultTableBehavior:=wdWord9TableBehavior, AutoFitBehavior:=wdAutoFitFixed
                            With Footer.Tables(1)
                                .Borders.Enable = False
                                If .Style <> "Table Grid" Then
                                    .Style = "Table Grid"
                                End If
                                .ApplyStyleHeadingRows = False
                                .ApplyStyleLastRow = False
                                .ApplyStyleFirstColumn = False
                                .ApplyStyleLastColumn = False
                                .ApplyStyleRowBands = False
                                .ApplyStyleColumnBands = False
                            End With
                            ' Left column
                            Set Selection = Footer.Tables(1).Cell(1, 1).Range
                            Selection.Select
                            Selection.Collapse wdCollapseStart
                            Selection.Text = "Mobiso " + AdminTypes(N) + " Administrator's Guide"
                            ' Right column
                            Selection.Start = Footer.Tables(1).Cell(1, 2).Range.Start
                            Selection.Select
                            Selection.Collapse wdCollapseStart
                            Selection.Text = arrBookmarks(J) + " <field> - <page>"
                            Selection.Find.Text = "<field>"
                            If Selection.Find.Execute Then
                                Selection.Select
                                S = "SEQ Chapter \c"
                                If arrBookmarks(J) = "Appendix" Then
                                    S = S + " \* ALPHABETIC"
                                Else
                                    S = S + " \* ARABIC"
                                End If
                                Selection.Fields.Add Range:=Selection, Type:=wdFieldEmpty, Text:=S, PreserveFormatting:=False
                            End If
                            Selection.Find.Text = "<page>"
                            If Selection.Find.Execute Then
                                Selection.Select
                                Selection.Fields.Add Range:=Selection, Type:=wdFieldEmpty, Text:="PAGE", PreserveFormatting:=False
                            End If
                            Footer.Tables(1).Cell(1, 2).Range.ParagraphFormat.Alignment = wdAlignParagraphRight
                            LastFooterType = arrBookmarks(J)
                        End If
                        ' Insert the section break
                        rngField.InsertBreak wdSectionBreakNextPage
                    End If
                End If
            End If
        Next fldPtr
    Next J
           
    Set rngTemp = ActiveDocument.Bookmarks("Appendix").Range
    rngTemp.Select
    rngTemp.Find.Execute FindText:="^b", ReplaceWith:="", Replace:=wdReplaceOne, Forward:=False
           
    ' Update table of contents and index
    ActiveDocument.TablesOfContents.Item(1).Update
    ActiveDocument.Indexes.Item(1).Update
    
    'MsgBox "Exporting " + AdminTypes(N) + " manual to PDF (" + D + AdminTypes(N) + ".pdf)"
    ActiveDocument.ExportAsFixedFormat D + AdminTypes(N) + ".pdf", wdExportFormatPDF, False, wdExportOptimizeForPrint, wdExportAllDocument
    
    ' Remove the inserted section breaks
    For J = JL To JU
        Set rngTemp = ActiveDocument.Bookmarks(arrBookmarks(J)).Range
        rngTemp.Find.Execute FindText:="^b", ReplaceWith:="", Replace:=wdReplaceAll
    Next J
    
    ' Helpful if you want to build one PDF, then check it
    'If (MsgBox("Built " + AdminTypes(N) + " PDF", vbOKCancel, "Continue?") = vbCancel) Then Exit For
      
Next

MsgBox "Done - Updated PDFs are in " + D

End Sub


This code has good examples of the following with VBA:

  • Set a FIELD tag
  • Delete a section break
  • Export a Word document
  • Create a footer
  • Create a table
  • Insert a section break

This post courtesy of http://mobiso.com

Mozilla/4.0 (compatible;)

This user agent was in the middle of many page requests in my Apache logs, requesting content referenced by link tags in the head section.

After a bit of research on one of the link tag URLs, I ran this script:

IPS=`grep Author access_log | cut -f 1 -d ' '  | sort | uniq`
for IP in $IPS
do
        echo Testing "$IP"
        host "$IP"  
done
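The script above selects log lines by a keyword taken from one of the link URLs. A different filter is to select on the user agent itself. This is a minimal PHP sketch that assumes Apache's combined log format, where the user agent is the last quoted field; the function name ipsForUserAgent and the sample lines (which use documentation-only addresses) are made up for illustration.

```php
<?php
// Hypothetical helper: returns the unique client IPs whose user agent equals $sAgent.
// Assumes the "combined" log format, in which the user agent is the last quoted field.
function ipsForUserAgent(array $aLines, $sAgent)
{
    $aIps = array();
    foreach ($aLines as $sLine) {
        // Group 1 is the client address, group 2 the last quoted field on the line
        if (preg_match('/^(\S+) .*"([^"]*)"\s*$/', $sLine, $aMatch) && $aMatch[2] === $sAgent) {
            $aIps[$aMatch[1]] = true; // array keys keep the list unique
        }
    }
    return array_keys($aIps);
}

// Made-up sample lines; real input could come from file('access_log', FILE_IGNORE_NEW_LINES).
$aLines = array(
    '192.0.2.10 - - [01/Jan/2011:10:00:00 -0500] "GET /feed HTTP/1.1" 200 512 "-" "Mozilla/4.0 (compatible;)"',
    '192.0.2.10 - - [01/Jan/2011:10:00:01 -0500] "GET /style.css HTTP/1.1" 200 128 "-" "Mozilla/4.0 (compatible;)"',
    '198.51.100.7 - - [01/Jan/2011:10:00:02 -0500] "GET / HTTP/1.1" 200 2048 "-" "Java/1.6.0_22"',
);
print_r(ipsForUserAgent($aLines, 'Mozilla/4.0 (compatible;)'));
```

Each returned address could then be passed to gethostbyaddr() for the same reverse lookup that the host command performs.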

In almost every case, the requests came from large organizations - corporations, government agencies, and the military.

These institutions often use proxy servers, and Mozilla/4.0 (compatible;) is probably a common user agent setting for requests made by those proxies.

In the one case where it wasn’t a large organization, it was a blacklisted IP, and the user agent was Java.

The sample set was limited, but the pattern was clear.