I'd like to capture text from a website (for personal usage, not business) but unfortunately they have disabled copy-paste, printing and viewing the source.

Is there another way of capturing the text?

3 Answers

1

Is there another way of capturing the text?

Likely so, but which method suits you best depends on several factors:

  • What is the intended use of this text? (Does it need to be editable, or do you simply want to read it?)

  • What type of text is it? (i.e. is it "normal" text or e.g. a .pdf or another type of document?)

  • Is the text produced by JavaScript?

  • What security measures has the site implemented to protect this text?

  • How much effort are you willing to expend to copy the text?

Images

As hinted at in the answer by @John, producing an image of the text would be one simple way to obtain a copy.

Personally, I wouldn't choose a smart phone for copying a website, however. Most browsers have the ability to be maximized and there are options for taking screenshots (pictures of program windows or your desktop) for most operating systems.

One drawback to this method might be the need to edit or stitch images together to obtain complete documents and/or the need to extract text with optical character recognition (OCR). There are free programs available that will allow you to do this, but this is obviously more complex than simple "cut and paste" operations.

Source Code

You can (potentially) get the source code of the web page by saving the web page from the browser (as HTML) or by using another program (e.g. curl or wget) to download the page. However, this may not always capture the desired text. If the text is rendered by JavaScript, or is an embedded document, it may be loaded/rendered by the browser as a secondary operation after the "basic" page source is loaded.
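As a rough illustration of the download approach, the sketch below fetches a page and reduces its HTML to plain text. The names `htmlToText` and `fetchPageText` are made up for this example, the entity table only covers a handful of common cases, and the regular expressions are a simplification rather than a real HTML parser. It assumes an environment with a global `fetch` (Node 18+ or a browser), and it will not see text that JavaScript renders after the initial load.

```javascript
// Small, deliberately incomplete entity table (an assumption for this sketch).
const ENTITIES = {
  '&amp;': '&', '&lt;': '<', '&gt;': '>',
  '&quot;': '"', '&#39;': "'", '&nbsp;': ' ',
};

// Reduce an HTML string to its readable text.
function htmlToText(html) {
  return html
    // Drop script/style/noscript elements together with their contents.
    .replace(/<(script|style|noscript)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, ' ')
    // Drop HTML comments.
    .replace(/<!--[\s\S]*?-->/g, ' ')
    // Treat the end of common block elements and <br> as line breaks.
    .replace(/<\/(p|div|li|h[1-6]|tr)\s*>|<br\s*\/?>/gi, '\n')
    // Strip whatever tags remain.
    .replace(/<[^>]+>/g, '')
    // Decode the few entities listed above, in a single pass.
    .replace(/&(amp|lt|gt|quot|#39|nbsp);/g, (match) => ENTITIES[match])
    // Collapse runs of spaces and tidy whitespace around line breaks.
    .replace(/[ \t]+/g, ' ')
    .replace(/\s*\n\s*/g, '\n')
    .trim();
}

// Download a page and return its text (the URL is supplied by the caller).
async function fetchPageText(url) {
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return htmlToText(await response.text());
}
```

Calling `fetchPageText('https://example.com/').then(console.log)` would print the extracted text. In a browser, cross-origin restrictions may block the request, so running this from Node is usually simpler.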

Removing The Blocking Code

Unfortunately they have disabled copy-paste, printing and viewing the source.

This is almost certainly done via JavaScript. It might be possible to use a browser plugin to run custom JavaScript (a "user script") on the page to modify the page and remove the blocking JavaScript, so the page behaves normally. That said, this would probably depend on any "user script" running first, as well as finding or creating a script able to remove the offending code in the first place.
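A minimal sketch of such a user script follows, assuming a manager such as Tampermonkey or Greasemonkey. The `@match` URL is a placeholder, the list of event names is a guess at what a site might intercept, and `unblockEvents` is a name invented for this example. The idea is to register capture-phase listeners on `window`, which run before handlers attached to elements further down the page, and to stop the event there without cancelling the browser's default action. It does not undo CSS rules such as `user-select: none`.

```javascript
// ==UserScript==
// @name   Unblock copy and context menu (sketch)
// @match  https://example.com/*
// @run-at document-start
// ==/UserScript==

// Events that sites commonly intercept to block copying (an assumed list).
const BLOCKED_EVENTS = ['copy', 'cut', 'contextmenu', 'selectstart', 'dragstart'];

function unblockEvents(target, eventNames = BLOCKED_EVENTS) {
  for (const name of eventNames) {
    target.addEventListener(
      name,
      // Stop the event before later listeners see it. preventDefault() is
      // deliberately not called, so the browser's default action still happens.
      (event) => event.stopImmediatePropagation(),
      { capture: true }
    );
  }
}

// Only install the listeners when running inside a browser page.
if (typeof window !== 'undefined') {
  unblockEvents(window);
}
```

The `@run-at document-start` line asks the script manager to inject the script as early as possible, which addresses the "running first" concern: a page script that registers its own capture-phase listener on `window` earlier would still win.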

Browser Automation

Modern browsers (e.g. Chrome, Chromium, Firefox) can be driven by automation tools from scripting languages such as JavaScript or Python (my personal preference). Importantly, these tools can (in certain instances) extract text from the page.
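For illustration, here is a sketch using Puppeteer, one JavaScript automation library for Chrome/Chromium (other tools would work too, and this is not necessarily what I would use myself). It assumes `npm install puppeteer` has been run; `extractVisibleText` and `captureText` are names invented for this example. Because the text is read from the rendered page, content produced by JavaScript is included.

```javascript
// Read the rendered text of a page through a Puppeteer-style `page` object.
async function extractVisibleText(page, url) {
  // 'networkidle2' waits until network activity has mostly stopped,
  // giving page scripts a chance to render their content.
  await page.goto(url, { waitUntil: 'networkidle2' });
  // innerText reflects what is rendered, not the raw HTML source.
  return page.evaluate(() => document.body.innerText);
}

// Launch a headless browser, grab the text, and always close the browser.
async function captureText(url) {
  // Loaded lazily so the helper above can be used without Puppeteer installed.
  const { default: puppeteer } = await import('puppeteer');
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    return await extractVisibleText(page, url);
  } finally {
    await browser.close();
  }
}
```

Something like `captureText('https://example.com/').then(console.log)` would print the page text. Sites that detect automated browsers or require a login need extra work that this sketch does not cover.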

0

I'd like to capture text from a website .... they have disabled copy-paste, printing and viewing the source.

Assuming the use to be personal and legal, one practical choice is to capture it on a smart phone camera. That applies to anything you can see (read).

This works on any device where (for whatever reason) copy / paste is not available. I have done this for BIOS and similar screens.

If you are doing this on Windows 10, WinKey + Shift + S is good at capturing screenshots and might work as well.

0

You can use JavaScript to override the settings in the page that prohibit text selection. Alan Hogan has written a neat JavaScript bookmarklet that does exactly that; you can find it here. I've been using it for almost two years now, mostly to copy names and chat messages from Zoom calls, where you also can't copy text normally.

All you need to do is to paste the following JS code as a bookmark in your browser and click it on the page you want to select and copy text:

javascript:
(function(){
  function allowTextSelection(){
    window.console&&console.log('allowTextSelection');
    /* Inject a style rule that re-enables text selection on every element. */
    var style=document.createElement('style');
    style.type='text/css';
    style.innerHTML='*,p,div{user-select:text%20!important;-moz-user-select:text%20!important;-webkit-user-select:text%20!important;}';
    document.head.appendChild(style);
    /* Replace handlers that commonly block selection, dragging and the context menu. */
    var elArray=document.body.getElementsByTagName('*');
    for(var i=0;i<elArray.length;i++){
      var el=elArray[i];
      el.onselectstart=el.ondragstart=el.ondrag=el.oncontextmenu=el.onmousedown=el.onmouseup=function(){
        return true;
      };
      /* Re-enable disabled text-like inputs and unblock typing in them. */
      if(el instanceof HTMLInputElement&&['text','password','email','number','tel','url'].indexOf(el.type.toLowerCase())>-1){
        el.removeAttribute('disabled');
        el.onkeydown=el.onkeyup=function(){
          return true;
        };
      }
    }
  }
  allowTextSelection();
})();

Or simply go to the site linked above and drag the bookmarklet to your bookmarks bar, which is much easier.
